feat(typing): Make IntoBackend generic #3002

Merged: FBruzzesi merged 16 commits into chore/expose-into-backend from into-backend-generic on Aug 17, 2025

Conversation

@dangotbanned (Member) commented Aug 16, 2025

What type of PR is this? (check all applicable)

  • 💾 Refactor
  • ✨ Feature
  • 🐛 Bug Fix
  • 🔧 Optimization
  • 📝 Documentation
  • ✅ Test
  • 🐳 Other

Related issues

Tasks

- New `SCREAMING_SNAKE_CASE` group for `Implementation`
  - Used only in `_compliant` (directly)
- Everything is a runtime typing symbol
These examples will show up everywhere, without needing to be repeated in `Into<...>Backend`
@dangotbanned added the enhancement and typing labels Aug 16, 2025
Comment on lines 77 to 83
Backend: TypeAlias = Literal[EagerAllowed, LazyAllowed]
"""Ooh look, a description!"""


BackendT = TypeVar("BackendT", bound=Backend)
IntoBackend: TypeAlias = Union[BackendT, ModuleType]
"""Anything that can be converted into a Narwhals Implementation.
@dangotbanned (Member Author) commented Aug 16, 2025

@FBruzzesi

So my question is: how come that BackendT = TypeVar("BackendT", bound=Backend) is a generic alias?

Short answer

`BackendT` is not a generic alias.
It is a `TypeVar` used in a generic alias named `IntoBackend`.

Longer answer

I think this concept is easier to understand using TypeAliasType

I'll build up the equivalent of IntoBackend using that and more explicit names:

from __future__ import annotations

from types import ModuleType
from typing import TYPE_CHECKING, Literal, Union

from typing_extensions import TypeAliasType

from narwhals._typing import EagerAllowed, LazyAllowed
from narwhals._typing_compat import TypeVar

if TYPE_CHECKING:
    from typing_extensions import TypeAlias


# `Backend` (equivalent to one or more of the members in `Literal`)
BackendAlias: TypeAlias = Literal[EagerAllowed, LazyAllowed]

# `BackendT` (as above, but *remembers* (narrows to) what we passed in)
BackendTypeVar = TypeVar("BackendTypeVar", bound=BackendAlias)


# `IntoBackend` (as above, and expands into `BackendTypeVar | ModuleType`)
IntoBackendGenericAlias = TypeAliasType(
    "IntoBackendGenericAlias",
    Union[BackendTypeVar, ModuleType],
    type_params=(BackendTypeVar,),  # <--- Parameterizing an alias makes the alias generic
)

# `IntoBackendAny` (similar concept, but no narrowing)
IntoBackendConcreteAlias: TypeAlias = Union[BackendAlias, ModuleType]


# narrowed to `Literal["polars"] | ModuleType`
IntoBackendSubscribedAlias1: TypeAlias = IntoBackendGenericAlias[Literal["polars"]]


# `IntoBackendEager` (narrowed to `<everything in `EagerAllowed`> | ModuleType`)
IntoBackendSubscribedAlias2: TypeAlias = IntoBackendGenericAlias[EagerAllowed]


# `IntoBackendBad` (narrowed to `Unknown | ModuleType`, because "bad" is not assignable to `BackendAlias`)
IntoBackendSubscribedAlias3: TypeAlias = IntoBackendGenericAlias[Literal["bad"]]  # type: ignore[type-var]
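
The difference between the generic alias and its subscripted forms can also be observed at runtime. The following is only a minimal sketch using the standard library; the `EagerAllowed` and `LazyAllowed` literals are hypothetical stand-ins, not the real narwhals definitions.

```python
from types import ModuleType
from typing import Literal, TypeVar, Union, get_args

# Hypothetical stand-ins for `narwhals._typing.EagerAllowed` / `LazyAllowed`.
EagerAllowed = Literal["pandas", "polars", "pyarrow"]
LazyAllowed = Literal["dask", "duckdb", "ibis"]
Backend = Literal[EagerAllowed, LazyAllowed]

BackendT = TypeVar("BackendT", bound=Backend)

# Generic alias: it still has one free type parameter.
IntoBackend = Union[BackendT, ModuleType]

# Subscripting substitutes the TypeVar and leaves nothing free.
IntoBackendEager = IntoBackend[EagerAllowed]

# The bound is only enforced by a type checker, not at runtime,
# so this line executes even though a checker should reject it.
IntoBackendBad = IntoBackend[Literal["bad"]]  # type: ignore[type-var]

print(IntoBackend.__parameters__)       # the free TypeVar
print(IntoBackendEager.__parameters__)  # empty: fully subscripted
print(get_args(IntoBackendEager))       # the eager literal and ModuleType
```

Only `IntoBackend` itself reports a remaining type parameter; both subscripted aliases are ordinary unions.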

Member Author

The syntax for this from 3.12 is a lot nicer https://docs.python.org/3/library/typing.html#type-aliases

from __future__ import annotations

from types import ModuleType

from narwhals._typing import EagerAllowed, LazyAllowed

type IntoBackend[T: EagerAllowed | LazyAllowed] = T | ModuleType
type IntoBackendAny = IntoBackend[EagerAllowed | LazyAllowed]
type IntoBackendEager = IntoBackend[EagerAllowed]

#3002 (comment)

Co-authored-by: Francesco Bruzzesi <42817048+FBruzzesi@users.noreply.github.com>
@dangotbanned (Member Author) commented Aug 17, 2025

Final round of bikeshedding

(#2971 (comment))

narwhals/narwhals/_utils.py

Lines 1612 to 1630 in f2932a6

_CanCollectInto: TypeAlias = Literal[
    Implementation.PANDAS, Implementation.POLARS, Implementation.PYARROW
]
_CanLazyInto: TypeAlias = Literal[
    Implementation.DASK, Implementation.DUCKDB, Implementation.POLARS, Implementation.IBIS
]


def can_collect_into(obj: Implementation) -> TypeIs[_CanCollectInto]:
    return obj in {Implementation.PANDAS, Implementation.POLARS, Implementation.PYARROW}


def can_lazy_into(obj: Implementation) -> TypeIs[_CanLazyInto]:
    return obj in {
        Implementation.DASK,
        Implementation.DUCKDB,
        Implementation.POLARS,
        Implementation.IBIS,
    }
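
To show what such a guard buys a caller, here is a small self-contained sketch. `Impl` is a hypothetical, trimmed-down stand-in for `Implementation`, and `collect_target` is an invented helper; the annotations are written as strings so the snippet runs without `typing_extensions` installed.

```python
from enum import Enum
from typing import TYPE_CHECKING, Literal

if TYPE_CHECKING:
    from typing_extensions import TypeAlias, TypeIs


class Impl(Enum):
    """Hypothetical, trimmed-down stand-in for `Implementation`."""

    PANDAS = "pandas"
    POLARS = "polars"
    PYARROW = "pyarrow"
    DASK = "dask"


_CanCollectInto: "TypeAlias" = Literal[Impl.PANDAS, Impl.POLARS, Impl.PYARROW]


def can_collect_into(obj: Impl) -> "TypeIs[_CanCollectInto]":
    return obj in {Impl.PANDAS, Impl.POLARS, Impl.PYARROW}


def collect_target(impl: Impl) -> str:
    """Hypothetical caller that only accepts collectable backends."""
    if can_collect_into(impl):
        # A type checker narrows `impl` to `_CanCollectInto` in this branch.
        return impl.value
    msg = f"cannot collect into {impl.name}"
    raise ValueError(msg)
```

At runtime the guard is a plain membership test; the narrowing itself only exists for the type checker.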

The conventions I've been trying to use in narwhals so far are:

(S|s)upports* for dunder protocols

narwhals/narwhals/_utils.py

Lines 120 to 124 in f2932a6

class _SupportsVersion(Protocol):
    __version__: str


class _SupportsGet(Protocol):  # noqa: PYI046
    def __get__(self, instance: Any, owner: Any | None = None, /) -> Any: ...

narwhals/narwhals/_utils.py

Lines 1633 to 1642 in f2932a6

def has_native_namespace(obj: Any) -> TypeIs[SupportsNativeNamespace]:
    return _hasattr_static(obj, "__native_namespace__")


def _supports_dataframe_interchange(obj: Any) -> TypeIs[DataFrameLike]:
    return hasattr(obj, "__dataframe__")


def supports_arrow_c_stream(obj: Any) -> TypeIs[ArrowStreamExportable]:
    return _hasattr_static(obj, "__arrow_c_stream__")
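
A rough sketch of the `Supports*` pattern as a pair of protocol and guard follows. It assumes a static attribute lookup is wanted and uses the standard library's `inspect.getattr_static` for that; how narwhals' own `_hasattr_static` is implemented is not shown here, and `supports_version` is a hypothetical name.

```python
import inspect
from types import ModuleType
from typing import TYPE_CHECKING, Any, Protocol

if TYPE_CHECKING:
    from typing_extensions import TypeIs


class SupportsVersion(Protocol):
    __version__: str


_MISSING = object()


def supports_version(obj: Any) -> "TypeIs[SupportsVersion]":
    """Return True if `obj` defines `__version__`, without triggering descriptors."""
    return inspect.getattr_static(obj, "__version__", _MISSING) is not _MISSING


# Usage with a throwaway module object (hypothetical package name).
fake_pkg = ModuleType("fake_pkg")
fake_pkg.__version__ = "1.2.3"
print(supports_version(fake_pkg))  # True
print(supports_version(object()))  # False
```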

Stores* for properties/attributes

narwhals/narwhals/_utils.py

Lines 126 to 128 in f2932a6

class _StoresColumns(Protocol):
    @property
    def columns(self) -> Sequence[str]: ...

narwhals/narwhals/_utils.py

Lines 139 to 183 in f2932a6

class _StoresNative(Protocol[NativeT_co]):  # noqa: PYI046
    """Provides access to a native object.

    Native objects have types like:

    >>> from pandas import Series
    >>> from pyarrow import Table
    """

    @property
    def native(self) -> NativeT_co:
        """Return the native object."""
        ...


class _StoresCompliant(Protocol[CompliantT_co]):  # noqa: PYI046
    """Provides access to a compliant object.

    Compliant objects have types like:

    >>> from narwhals._pandas_like.series import PandasLikeSeries
    >>> from narwhals._arrow.dataframe import ArrowDataFrame
    """

    @property
    def compliant(self) -> CompliantT_co:
        """Return the compliant object."""
        ...


class _StoresBackendVersion(Protocol):
    @property
    def _backend_version(self) -> tuple[int, ...]:
        """Version tuple for a native package."""
        ...


class _StoresVersion(Protocol):
    _version: Version
    """Narwhals API version (V1 or MAIN)."""


class _StoresImplementation(Protocol):
    _implementation: Implementation
    """Implementation of native object (pandas, Polars, PyArrow, ...)."""

Into* for input argument coercion

This initially seems like it would be a good fit, but we're talking about the type of the Implementation passed into a narwhals object that has its own Implementation.

Maybe that constitutes a conversion?

I've been careful to label this group with coercion, since what we are usually doing is taking a wide range of types and normalizing them to a single type before using that for some operation.

So an Into* is usually not describing a set of types, each with varying behavior 🤔

IntoBackend: TypeAlias = Union[BackendT, ModuleType]

IntoExpr: TypeAlias = Union["Expr", str, "Series[Any]"]

Into1DArray: TypeAlias = "_1DArray | _NumpyScalar"

IntoDType: TypeAlias = "dtypes.DType | type[NonNestedDType]"

IntoSchema: TypeAlias = "Mapping[str, dtypes.DType] | Schema"

IntoArrowTable: TypeAlias = "ArrowStreamExportable | pa.Table"
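
The "coercion" reading of `Into*` can be sketched as follows: several accepted input types are normalized to one type up front. Both `IntoBackendName` and `to_backend_name` are hypothetical names for illustration and are not part of narwhals.

```python
import json.decoder
from types import ModuleType
from typing import Union

# Hypothetical alias: anything we are willing to coerce into a backend name.
IntoBackendName = Union[str, ModuleType]


def to_backend_name(backend: IntoBackendName) -> str:
    """Normalize a module or a string to a top-level package name."""
    if isinstance(backend, ModuleType):
        # Submodules collapse to their top-level package.
        return backend.__name__.partition(".")[0]
    return backend


print(to_backend_name("polars"))      # polars
print(to_backend_name(json))          # json
print(to_backend_name(json.decoder))  # json
```

After the call, downstream code only has to handle a single type, which is the property the `Into*` prefix is meant to signal.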

Using the method name directly for mostly single method protocols

class ToNumpy(Protocol[ToNumpyT_co]):
    def to_numpy(self, *args: Any, **kwds: Any) -> ToNumpyT_co: ...

class FromNumpy(Protocol[FromNumpyT_contra]):
    @classmethod
    def from_numpy(cls, data: FromNumpyT_contra, *args: Any, **kwds: Any) -> Self: ...

class FromIterable(Protocol[FromIterableT_contra]):
    @classmethod
    def from_iterable(
        cls, data: Iterable[FromIterableT_contra], *args: Any, **kwds: Any
    ) -> Self: ...

class ToDict(Protocol[ToDictDT_co]):
    def to_dict(self, *args: Any, **kwds: Any) -> ToDictDT_co: ...


class FromDict(Protocol[FromDictDT_contra]):
    @classmethod
    def from_dict(cls, data: FromDictDT_contra, *args: Any, **kwds: Any) -> Self: ...

class ToArrow(Protocol[ToArrowT_co]):
    def to_arrow(self, *args: Any, **kwds: Any) -> ToArrowT_co: ...


class FromArrow(Protocol[FromArrowDT_contra]):
    @classmethod
    def from_arrow(cls, data: FromArrowDT_contra, *args: Any, **kwds: Any) -> Self: ...

class FromNative(Protocol[FromNativeT]):
    @classmethod
    def from_native(cls, data: FromNativeT, *args: Any, **kwds: Any) -> Self: ...
    @staticmethod
    def _is_native(obj: FromNativeT | Any, /) -> TypeIs[FromNativeT]:
        """Return `True` if `obj` can be passed to `from_native`."""
        ...

class ToNarwhals(Protocol[ToNarwhalsT_co]):
    def to_narwhals(self) -> ToNarwhalsT_co:
        """Convert into public representation."""
        ...

*Convertible for a {From*,To*} pair

class NumpyConvertible(
    ToNumpy[ToNumpyT_co],
    FromNumpy[FromNumpyDT_contra],
    Protocol[ToNumpyT_co, FromNumpyDT_contra],
):
    def to_numpy(self, dtype: Any, *, copy: bool | None) -> ToNumpyT_co: ...

class DictConvertible(
    ToDict[ToDictDT_co],
    FromDict[FromDictDT_contra],
    Protocol[ToDictDT_co, FromDictDT_contra],
): ...

class ArrowConvertible(
    ToArrow[ToArrowT_co],
    FromArrow[FromArrowDT_contra],
    Protocol[ToArrowT_co, FromArrowDT_contra],
): ...
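
For readers unfamiliar with how a `{From*, To*}` pair composes, here is a minimal, self-contained sketch of the same shape. The protocols are simplified re-declarations, and `Point`, `ConvT`, and `round_trip` are hypothetical names used only for illustration.

```python
from typing import TYPE_CHECKING, Any, Dict, Protocol, TypeVar

if TYPE_CHECKING:
    from typing_extensions import Self

ToDictT_co = TypeVar("ToDictT_co", covariant=True)
FromDictT_contra = TypeVar("FromDictT_contra", contravariant=True)


class ToDict(Protocol[ToDictT_co]):
    def to_dict(self, *args: Any, **kwds: Any) -> ToDictT_co: ...


class FromDict(Protocol[FromDictT_contra]):
    @classmethod
    def from_dict(cls, data: FromDictT_contra, *args: Any, **kwds: Any) -> "Self": ...


class DictConvertible(
    ToDict[ToDictT_co],
    FromDict[FromDictT_contra],
    Protocol[ToDictT_co, FromDictT_contra],
): ...


class Point:
    """Hypothetical class that satisfies `DictConvertible` structurally."""

    def __init__(self, x: int, y: int) -> None:
        self.x = x
        self.y = y

    def to_dict(self, *args: Any, **kwds: Any) -> Dict[str, int]:
        return {"x": self.x, "y": self.y}

    @classmethod
    def from_dict(cls, data: Dict[str, int], *args: Any, **kwds: Any) -> "Point":
        return cls(data["x"], data["y"])


ConvT = TypeVar("ConvT", bound="DictConvertible[Any, Any]")


def round_trip(obj: ConvT) -> ConvT:
    """Export to a dict and rebuild an object of the same class."""
    return type(obj).from_dict(obj.to_dict())
```

`Point` never inherits from any of the protocols; it matches `DictConvertible` purely by having both methods, which is why one name per method composes cleanly into the `*Convertible` pair.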

What's going on here doesn't fit into any of those boxes 😭

Update 1

(00e9ed0) moved them to _typing.py

Update 2

Resolved in (refactor: Rename, re-doc)

Comment on lines +137 to +141


IntoBackendAny: TypeAlias = IntoBackend[Backend]
IntoBackendEager: TypeAlias = IntoBackend[EagerAllowed]
IntoBackendLazy: TypeAlias = IntoBackend[LazyAllowed]
@dangotbanned (Member Author) commented Aug 17, 2025

I haven't used these yet.

The downside to them is the need to repeat (or condense) the docs from IntoBackend.

Suggested change
IntoBackendAny: TypeAlias = IntoBackend[Backend]
IntoBackendEager: TypeAlias = IntoBackend[EagerAllowed]
IntoBackendLazy: TypeAlias = IntoBackend[LazyAllowed]

If the additional length is an issue, we could alias *Allowed like this to get the benefits of both:

Eager: TypeAlias = EagerAllowed
Lazy: TypeAlias = LazyAllowed

def some_function(backend: IntoBackend[Eager]): ...

But honestly all of this is shorter than main anyway 😄:

def some_function(backend: ModuleType | Implementation | str): ...

wow pretty big one I missed there 😅
@dangotbanned marked this pull request as ready for review August 17, 2025 14:07
@dangotbanned requested a review from FBruzzesi August 17, 2025 16:43
dangotbanned added a commit that referenced this pull request Aug 17, 2025
@FBruzzesi (Member)

@dangotbanned I think this is ready to merge? What's left? If it's just variable naming, we can chat about it to make the loop quicker

@dangotbanned (Member Author)

> @dangotbanned I think this is ready to merge? What's left? If it's just variable naming, we can chat about it to make the loop quicker

Yeah I'm happy with it!

I just wanted to make sure you were before merging back into your PR 😊

@FBruzzesi (Member)

> Yeah I'm happy with it!
>
> I just wanted to make sure you were before merging back into your PR 😊

Yep! Let's do it!

@FBruzzesi merged commit f12b6e8 into chore/expose-into-backend Aug 17, 2025
32 of 33 checks passed
@FBruzzesi deleted the into-backend-generic branch August 17, 2025 20:13
Labels

enhancement New feature or request typing

2 participants