Skip to content

fix(typing): Narrow TypeVar used in Series#2347

Merged
MarcoGorelli merged 11 commits intomainfrom
narrow-series-type-var
Apr 7, 2025
Merged

fix(typing): Narrow TypeVar used in Series#2347
MarcoGorelli merged 11 commits intomainfrom
narrow-series-type-var

Conversation

@dangotbanned
Copy link
Member

@dangotbanned dangotbanned commented Apr 4, 2025

Will close #2343

What type of PR is this? (check all applicable)

  • 💾 Refactor
  • ✨ Feature
  • 🐛 Bug Fix
  • 🔧 Optimization
  • 📝 Documentation
  • ✅ Test
  • 🐳 Other

Related issues

Checklist

  • Code follows style guide (ruff)
  • Tests added
  • Documented the changes

If you have comments or can explain your changes, please do so below

  • Changing IntoSeries gets us most of the way there
  • Settled on a bit of a creative solution

Comment on lines 394 to 399
# TODO @dangotbanned: Fix this one
# Current:
# `unstable_nw.Series[unstable_nw.Series[pl.Series]]``
# Goal:
# `unstable_nw.Series[pl.Series]`
assert_type(nw_series_passthrough, unstable_nw.Series[pl.Series])
Copy link
Member Author

@dangotbanned dangotbanned Apr 4, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

NOTE

What we need is for from_native to return the same type on these branches

# Early returns
if isinstance(native_object, (DataFrame, LazyFrame)) and not series_only:
return native_object
if isinstance(native_object, Series) and (series_only or allow_series):
return native_object

Right now, this introduces a level of nesting that gets silly quickly

Type of "nw_series_nest_2" is "narwhals.series.Series[narwhals.series.Series[narwhals.series.Series[polars.series.series.Series]]]"

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(6bd47b4) is a very heavy-handed fix.

My thinking is

  • Passing a narwhals object to from_native really should never have been supported
    • None of these are native types: nw.Series | nw.DataFrame | nw.LazyFrame
  • Removing that support would likely break someones workflow
  • Ignoring all flags requires only adding 1 new @overload per-class
    • None of those need to deal with recursion

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Was expecting the v1 backport (253da29) would surface some downstream issues - but all seems fine 🤔

class NativeLazyFrame(NativeFrame, Protocol):
def explain(self, *args: Any, **kwargs: Any) -> Any: ...

# TODO @dangotbanned: `nw.Series` **cannot** be allowed to match this!!!
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

NativeSeries is now what is used for IntoSeries.

An issue - that's new to me - is that Series also matches NativeSeries 🤦‍♂️

So for the original alias:

IntoSeries: TypeAlias = Union["Series[Any]", "NativeSeries"]

nw.Series can be a match in 3 positions

IntoSeries: TypeAlias = Union["Series[Any]", "NativeSeries"]
#                             ^--1--^ ^2^    ^-----3-----^

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@dangotbanned literally an hour ago (#2347)

Should be the easiest to fix in (#2239)

🤦‍♂️🤦‍♂️🤦‍♂️🤦‍♂️🤦‍♂️

Comment on lines +132 to +133
@overload
def from_native(native_object: SeriesT, **kwds: Any) -> SeriesT: ...
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But why does this work?

By having an unconditional @overload when a nw.Series is passed - it ensures no other @overload(s) are checked.

If we were to spell out all the valid cases, they would overlap with the NativeSeries cases (see #2347 (comment))

narwhals.from_native(native_series, allow_series=True)
narwhals.from_native(native_series, series_only=True)

Needing to have **kwds is unfortunate - but

practicality beats purity

I'm open to any other solutions that can pass the new tests 🙂

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

shall we raise at runtime if kwds?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

shall we raise at runtime if kwds?

Yeah I was thinking about that as well!

I wasn't sure

  • what the message should be?
  • could that logic be shared for both v1 and main?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@dangotbanned dangotbanned marked this pull request as ready for review April 4, 2025 21:27
Copy link
Member

@MarcoGorelli MarcoGorelli left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks @dangotbanned ! just one comment

Comment on lines +132 to +133
@overload
def from_native(native_object: SeriesT, **kwds: Any) -> SeriesT: ...
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

shall we raise at runtime if kwds?

Comment on lines +360 to +362
if kwds:
msg = f"from_native() got an unexpected keyword argument {next(iter(kwds))!r}"
raise TypeError(msg)
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Important

This matches the runtime behavior of python.

import polars as pl
import narwhals as nw

df = pl.DataFrame({"a": ["A", "B", "A"]})
nw_df = nw.from_native(df)
nw.to_native(nw_df, keyword_1="a", keyword_2="b")

Only the first bad argument is ever included:

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
Cell In[5], line 39
     37 df = pl.DataFrame({"a": ["A", "B", "A"]})
     38 nw_df = nw.from_native(df)
---> 39 nw.to_native(nw_df, keyword_1="a", keyword_2="b")

TypeError: to_native() got an unexpected keyword argument 'keyword_1'

Copy link
Member

@MarcoGorelli MarcoGorelli left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks @dangotbanned !

@MarcoGorelli MarcoGorelli merged commit e565223 into main Apr 7, 2025
27 of 28 checks passed
@MarcoGorelli MarcoGorelli deleted the narrow-series-type-var branch April 7, 2025 10:55
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Narrow TypeVar used in Series

2 participants