chore: Add EagerSeries*Namespace protocols#2294
Conversation
- Needs to work differently to `Expr` version
Using `.cat` as a test case, since it has only 1 method
Pushing to try out `3.8`& `3.10` bugs in CI again
- Regular `Protocol` failed on `3.8`, `3.9`, `3.10` - (f4c0cd6) - Testing if I need on **every** protocol, or just the one with an `__init__`
- Thinking this allows using the `Protocol` defs within `EagerSeries` properties 6a39d56
They were added before the package existed and don't belong in `utils`
| class _SeriesNamespace( # type: ignore[misc] | ||
| _StoresCompliant[CompliantSeriesT_co], | ||
| _StoresNative[NativeSeriesT_co], | ||
| Protocol[CompliantSeriesT_co, NativeSeriesT_co], | ||
| ): | ||
| _compliant_series: CompliantSeriesT_co | ||
|
|
||
| @property | ||
| def compliant(self) -> CompliantSeriesT_co: | ||
| return self._compliant_series | ||
|
|
||
| @property | ||
| def native(self) -> NativeSeriesT_co: | ||
| return self._compliant_series.native # type: ignore[no-any-return] | ||
|
|
||
| def from_native(self, series: Any, /) -> CompliantSeriesT_co: | ||
| return self.compliant._from_native_series(series) | ||
|
|
||
|
|
||
| class EagerSeriesNamespace( | ||
| _SeriesNamespace[EagerSeriesT_co, NativeSeriesT_co], | ||
| Generic[EagerSeriesT_co, NativeSeriesT_co], | ||
| ): | ||
| _compliant_series: EagerSeriesT_co | ||
|
|
||
| def __init__(self, series: EagerSeriesT_co, /) -> None: | ||
| self._compliant_series = series | ||
|
|
||
|
|
||
| class EagerSeriesCatNamespace( # type: ignore[misc] | ||
| _SeriesNamespace[EagerSeriesT_co, NativeSeriesT_co], | ||
| CatNamespace[EagerSeriesT_co], | ||
| Protocol[EagerSeriesT_co, NativeSeriesT_co], | ||
| ): ... |
There was a problem hiding this comment.
Now that these are in the right place, I can collect my thoughts on the 3 solutions we can use for Protocol definintions with an __init__.
Problem
Defining a constructor in a Protocol can produce two different bugs.
I've documented that in (#2064 (comment)) - but the short of it is 3.8 and 3.(9|10) have issues.
Solutions
1. Protocol38
This was the first one I discovered (for #2064) and I really dislike it due to the complexity.
"Fixing" 3.8 requires tricking the type checker into thinking typing.Generic is typing.Protocol.
The side-effect is every sub-protocol is then required to use that Protocol38 definition or face runtime errors - that can no-longer be caught statically
E TypeError: Protocols can only inherit from other protocols, got <class 'narwhals._compliant.series.EagerSeriesNamespace'>"Fixing" 3.(9|10) requires redefining __init__ anyway in the impl.
Which mostly negates the benefit of defining in the Protocol 🤦♂️
Repeat __init__
narwhals/narwhals/_compliant/expr.py
Lines 332 to 345 in 697bb2c
narwhals/narwhals/_arrow/expr.py
Lines 33 to 55 in 697bb2c
narwhals/narwhals/_pandas_like/expr.py
Lines 69 to 91 in 697bb2c
2. Define the constructor another way
(#2261) introduced a different solution - but still needed to use Protocol38 to avoid the metaclass issues of the first solution.
I like this a lot more as it feels more intuitive from the outside - given the clear chain of @classmethod and @property definitions.
when-then
narwhals/narwhals/_arrow/namespace.py
Lines 241 to 242 in 697bb2c
narwhals/narwhals/_arrow/namespace.py
Lines 282 to 285 in 697bb2c
narwhals/narwhals/_compliant/when_then.py
Lines 72 to 82 in 697bb2c
narwhals/narwhals/_compliant/when_then.py
Lines 63 to 64 in 697bb2c
narwhals/narwhals/_compliant/when_then.py
Lines 67 to 70 in 697bb2c
narwhals/narwhals/_compliant/when_then.py
Lines 94 to 115 in 697bb2c
The only downside I could see is __new__ being a less-understood method than __init__.
Personally it took me a while to understand when it makes sense to use.
Even now that I do - I didn't think of it as a solution until a month after (#2064) merged 😅
3. Avoid constructors entirely in Protocol(s)
That is the solution this comment is attached to.
It might be easy to miss, but _SeriesNamespace is a Protocol and EagerSeriesNamespace is a Generic class.
The interesting aspect of that is the Protocol(s) represent the interface - but the concrete impl is separate.
This approach makes sense when the interface and implementation differ greatly.
Here it is ArrowSeries and PandasLikeSeries - but we go fully structural for all of Polars
I forgot that one isn't for `pd.Series`
Not used anywhere else, no need to check `Implementation` so many times
|
Haven't figured out why yet, but the narwhals/utils/generate_backend_completeness.py Lines 103 to 111 in 697bb2c First guess would be the extra properties for narwhals/narwhals/_pandas_like/utils.py Lines 813 to 824 in 7daa8f6 UpdateResolved in (3ff77a0) |
Seems to have solved the issue locally #2294 (comment)
EagerSeries*Namespace protocolsEagerSeries*Namespace protocols
Still need to untangle `CompliantExpr` + friends #2294 (comment)
* refactor: Add `SelectorNamespace.from_namespace` #2294 (comment) * refactor: Reuse pattern from `when_then` * refactor: Only use `Protocol38` in sub `CompliantExpr` Still need to untangle `CompliantExpr` + friends #2294 (comment) * refactor: Simplify typing * refactor: Move `Eval*` aliases to `_compliant.typing`

Closes #2230
What type of PR is this? (check all applicable)
Related issues
_Stores(Native|Compliant)#2130._native_seriesrefs #2293Compliant*protocols #2230CompliantExpr#21193.8#2084Checklist
If you have comments or can explain your changes, please do so below
<3.11compatible approachEagerSeries*Namespaceprotocols #2294 (comment))PandasLike*CatNamespaceDateTimeNamespaceListNamespaceStringNamespaceStructNamespacegenerate_backend_completenessEagerSeries*Namespaceprotocols #2294 (comment))