feat(DRAFT): Add `get_supertype` by dangotbanned · Pull Request #3396 · narwhals-dev/narwhals

dangotbanned · 2026-01-10T12:20:21Z

Description

Important

@FBruzzesi and I have been + are still iterating on this
Core functionality is there, focusing on readability, performance + shrinking the test suite

This PR implements polars' concept of supertyping - which more generally defines which types can be safely promoted/demoted/cast to other types.

I really like the DuckDB visualization of their version¹ of these rules, so here's that for an example:

Show Casting Operations Matrix

This is a preliminary step for implementing relaxed concat (#3386).
The aim is for us to own a consistent set of rules that all/most backends can participate in.
We've already dropped some supertypes that are valid in polars, but may prove challenging in other backends such as #121.
Some others are directly mentioned in comments (e.g. (Struct, DType) -> Struct)
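
To make the idea of a supertype concrete, here is a minimal sketch for integer pairs only. It is not the set of rules in this PR: `IntKind` and `toy_integer_supertype` are hypothetical stand-ins rather than narwhals classes, and falling back to `"Float64"` when no 64-bit integer fits is an assumption modelled loosely on polars.

```python
from __future__ import annotations

from typing import NamedTuple


class IntKind(NamedTuple):
    """Hypothetical stand-in for an integer dtype (not a narwhals class)."""

    signed: bool
    bits: int


def toy_integer_supertype(left: IntKind, right: IntKind) -> IntKind | str:
    """Return the smallest toy type that can hold every value of both inputs."""
    if left.signed == right.signed:
        # Same signedness: the wider of the two is enough.
        return left if left.bits >= right.bits else right
    signed, unsigned = (left, right) if left.signed else (right, left)
    # A signed type needs at least one more bit than the unsigned one;
    # with power-of-two widths that means doubling the unsigned width.
    needed = max(signed.bits, unsigned.bits * 2)
    # Assumption: beyond 64 bits we give up on integers entirely.
    return IntKind(True, needed) if needed <= 64 else "Float64"
```

For example, `toy_integer_supertype(IntKind(True, 8), IntKind(False, 8))` gives `IntKind(signed=True, bits=16)`, because neither 8-bit type can represent the other's full range.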

Additional use-cases

Supertyping in polars is used for much more than just a subset of concat.
In (#2572), it is one of the larger concepts missing from the intermediate representation (see #3386 (comment)).

polars-plan::plans::conversion::type_coercion is full of examples of how deeply related the concept is with expressions.
My aim is not to reproduce all of that 😅 - but to be able to reason about DTypes between LazyFrame operations without querying the backend for a Schema between every step 🤞

Related issues

Tasks

¹ DuckDB also mentions another set of rules called Combination Casting, which is entirely implicit.
The matrix doesn't reflect these and only one cast example is given, but it would apply to nw.concat:
"This combination casting occurs for ..., set operations (UNION / EXCEPT / INTERSECT), and ..."

FBruzzesi · 2026-01-25T11:23:26Z

narwhals/dtypes/_supertyping.py

+SameTemporalT = TypeVar("SameTemporalT", Datetime, DatetimeV1, Duration, DurationV1)
+"""Temporal data types, with a `time_unit` attribute."""
+
+SameDatetimeT = TypeVar("SameDatetimeT", Datetime, DatetimeV1)
+SameT = TypeVar(
+    "SameT", Array, List, Struct, Datetime, DatetimeV1, Duration, DurationV1, Enum
+)


@dangotbanned I am not a big fan of this naming, to be honest. The TypeVar already suggests that the two objects will be the same.

I think renaming as:

-SameTemporalT = TypeVar("SameTemporalT", Datetime, DatetimeV1, Duration, DurationV1)
+TemporalT = TypeVar("TemporalT", Datetime, DatetimeV1, Duration, DurationV1)
 """Temporal data types, with a `time_unit` attribute."""
-SameDatetimeT = TypeVar("SameDatetimeT", Datetime, DatetimeV1)
-SameT = TypeVar(
-    "SameT", Array, List, Struct, Datetime, DatetimeV1, Duration, DurationV1, Enum
+DatetimeT = TypeVar("DatetimeT", Datetime, DatetimeV1)
+ParametricT = TypeVar(
+    "ParametricT", Array, List, Struct, Datetime, DatetimeV1, Duration, DurationV1, Enum
 )

would be better

dangotbanned · 2026-01-25

I'm trying to convey the difference between constrained and bound TypeVars.

https://typing.python.org/en/latest/spec/generics.html#introduction

https://typing.python.org/en/latest/spec/generics.html#type-variables-with-an-upper-bound

The only convention I've seen for constrained is like this:

AorB = TypeVar("AorB", A, B)

But it doesn't scale well to more than 2 types 😔

So I've been using Same* for this purpose.

A more verbose, but probably more accurate name would be:

EitherABCD = TypeVar("EitherABCD", A, B, C, D)

The main point is this kind of typing will reject A | B, meaning you need to have narrowed to exactly one of the constraints
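
A small sketch of the constrained-versus-bound distinction discussed above. The `Temporal`, `Datetime` and `Duration` classes here are simplified stand-ins (not the narwhals ones), and the accept/reject behaviour described in the comments is what a static type checker reports; nothing is enforced at runtime.

```python
from typing import TypeVar


class Temporal:
    """Hypothetical common base, used only for the bound example."""


class Datetime(Temporal): ...


class Duration(Temporal): ...


# Constrained: every use must solve to exactly one of the listed types.
SameTemporalT = TypeVar("SameTemporalT", Datetime, Duration)

# Bound: any subtype of `Temporal` is accepted, including a union of subtypes.
BoundTemporalT = TypeVar("BoundTemporalT", bound=Temporal)


def first_same(left: SameTemporalT, right: SameTemporalT) -> SameTemporalT:
    # A checker rejects `first_same(Datetime(), Duration())` and any argument
    # typed `Datetime | Duration`: the caller has to narrow first.
    return left


def first_bound(left: BoundTemporalT, right: BoundTemporalT) -> BoundTemporalT:
    # A checker accepts `first_bound(Datetime(), Duration())` by solving the
    # TypeVar to the shared base `Temporal`.
    return left
```

The two kinds can also be told apart at runtime through `SameTemporalT.__constraints__` and `BoundTemporalT.__bound__`.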

dangotbanned · 2026-02-06T17:31:45Z

narwhals/dtypes/_supertyping.py

+DEC128_MAX_PREC = 38
+# Precomputing powers of 10 up to 10^38
+POW10_LIST = tuple(10**i for i in range(DEC128_MAX_PREC + 1))
+INT_MAX_MAP: Mapping[IntegerType, int] = {
+    UInt8(): (2**8) - 1,
+    UInt16(): (2**16) - 1,
+    UInt32(): (2**32) - 1,
+    UInt64(): (2**64) - 1,
+    Int8(): (2**7) - 1,
+    Int16(): (2**15) - 1,
+    Int32(): (2**31) - 1,
+    Int64(): (2**63) - 1,
+}


A few notes on this:

I've kept this as one big comment since we're already at 77! 😳

(1) Could we be lazy-er?

I would prefer if we defer generating this until it is needed.

E.g. I'd expect _integer_supertyping and _primitive_numeric_supertyping to be more commonly used - but even they don't exist at module-import-time
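
One way point (1) could look, as a sketch only: wrap the table in a cached function so it is built on first use rather than at import. The name `int_max_map` and the string keys are hypothetical; the PR keys the table by dtype, not by name.

```python
from __future__ import annotations

from functools import cache


@cache
def int_max_map() -> dict[str, int]:
    """Build the lookup table on first call, then reuse the same dict."""
    widths = (8, 16, 32, 64)
    table = {f"UInt{bits}": (1 << bits) - 1 for bits in widths}
    table.update({f"Int{bits}": (1 << (bits - 1)) - 1 for bits in widths})
    return table
```

Every call after the first returns the same cached object, so callers should treat it as read-only.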

(2) DType vs type[DType] keys

I think this is the only place we have instances as mapping keys, not sure why?

For example, it means each call here instantiates more DTypes, when we could just use the type itself 😅

narwhals/narwhals/dtypes/_supertyping.py

Line 355 in 548e5b8

if integer in {UInt128(), Int128()}:
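
A sketch of what point (2) could mean in practice: compare on the class and keep the set at module level, so no extra instances are created per call. The classes below are bare stand-ins, and `WIDE_INTEGERS` and `is_wide_integer` are hypothetical names.

```python
class DType:
    """Simplified stand-in for the narwhals base class."""


class Int64(DType): ...


class Int128(DType): ...


class UInt128(DType): ...


# Built once; membership tests hash the class, not a fresh instance.
WIDE_INTEGERS = frozenset({Int128, UInt128})


def is_wide_integer(integer: DType) -> bool:
    # `type(integer)` avoids instantiating `UInt128()` / `Int128()` on each call.
    return type(integer) in WIDE_INTEGERS
```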

(3) NumericType.max?

I had a look upstream and it seems to be heading in the direction of (#3396 (comment)) and (#3396 (comment)).

What do you think about adding these maximums to the classes, (similar to _bits)?
That way we could compare directly and probably avoid the lookup table
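
A rough sketch of point (3), assuming each integer class already carries `_bits` as mentioned above. The `_signed` attribute and the `max_value` classmethod are hypothetical names made up for this illustration, not existing narwhals API.

```python
from __future__ import annotations

from typing import ClassVar


class IntegerType:
    _bits: ClassVar[int]
    _signed: ClassVar[bool]  # hypothetical, for illustration only

    @classmethod
    def max_value(cls) -> int:
        # One bit is reserved for the sign in two's-complement types.
        value_bits = cls._bits - 1 if cls._signed else cls._bits
        return (1 << value_bits) - 1


class Int8(IntegerType):
    _bits = 8
    _signed = True


class UInt8(IntegerType):
    _bits = 8
    _signed = False


class UInt64(IntegerType):
    _bits = 64
    _signed = False
```

With this shape the maximums can be compared directly, e.g. `Int8.max_value() < UInt8.max_value()`, without a separate lookup table.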

(4) Minor tweak

I think this is the more efficient way to do these calculations.

Note
It would be exactly the second time I've found a use-case for bitshifting operators 😂

 INT_MAX_MAP: Mapping[IntegerType, int] = {
-    UInt8(): (2**8) - 1,
-    UInt16(): (2**16) - 1,
-    UInt32(): (2**32) - 1,
-    UInt64(): (2**64) - 1,
-    Int8(): (2**7) - 1,
-    Int16(): (2**15) - 1,
-    Int32(): (2**31) - 1,
-    Int64(): (2**63) - 1,
+    UInt8(): (1 << 8) - 1,
+    UInt16(): (1 << 16) - 1,
+    UInt32(): (1 << 32) - 1,
+    UInt64(): (1 << 64) - 1,
+    Int8(): (1 << 7) - 1,
+    Int16(): (1 << 15) - 1,
+    Int32(): (1 << 31) - 1,
+    Int64(): (1 << 63) - 1,
 }
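
A quick check that the two spellings in the tweak above produce the same values. `int_max` is a hypothetical helper written for this check; it says nothing about which form is faster.

```python
def int_max(bits: int, *, signed: bool) -> int:
    """Largest value of a two's-complement integer with `bits` bits."""
    return (1 << (bits - 1 if signed else bits)) - 1


for bits in (8, 16, 32, 64):
    # The shift form and the power form agree for every width in the table.
    assert int_max(bits, signed=False) == (2**bits) - 1
    assert int_max(bits, signed=True) == (2 ** (bits - 1)) - 1
```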

* refactor: Replace `_same_supertype` with a custom `@singledispatch`
  This is more generally useful and a LOT easier to read from the outside
* refactor: Just use a real class
* fix(typing): Satisfy `mypy`
* fix: Oops forgot the first element
* refactor(typing): Use slightly better names
* chore: Rename `default` -> `upper_bound`
* docs: Replace debugging doc
* docs: More cleanup
* refactor: Use `__slots__`, remove a field
* docs: More, more cleanup
* docs: lil bit of `.register` progress
* cov
* test: Get full coverage for `@just_dispatch`
* chore: Give it a simple repr
* test: Oops, forgot that was an override
* revert: Keep only what is required
  See #3396 (comment)
* refactor: Simplify `@just_dispatch` signature
* fix(typing): Satisfy mypy
* test: Gotta get that coverage
  Resolves #3410 (comment)
* docs: Restore a minimal version of `@just_dispatch` doc
  Resolves #3410 (comment)
* revert: Remove `Impl` alias
  #3410 (comment)
* refactor: Rename `Passthrough` -> `PassthroughFn`
  Suggested in #3410 (review)
* docs: Add note to use only on internal
  Suggested in #3410 (review)

FBruzzesi and others added 30 commits January 3, 2026 20:27

WIP

0f5905d

chore: Appease ruff

bd7d0d9

chore: Appease mypy

dcd6d79

test: xfail a todo

560d577

refactor: Split out supertyping from dtypes

b40a6a9

chore: Define integer bits in the class def

dade8f2

start adding typing

5892f7a

fix: wow

36ce6ea

refactor: Use IntegerType._bits

46c21a2

skip using DTypes for IntegerTypes

ea6b143

a bit more typing-friendly

3743462

None can't return here

37ba753

refactor: No versioning needed for Float64

c2d9ddf

test: Add ids for tests

48c952d

Much easier to pick one to debug this way

change the problem to be which function

34c8088

Generate a (cached) IntegerType search space instead

3f4706d

docs: Add links for Enum

1a0f193

cheaper float compare

9a19344

cheaper int, float compare

1e64d47

add DType.__eq__ todo

2fd8800

avoid repeating binary checks

7425ecc

perf: Use min(..., key=...) and move cache for _min_time_unit

69949f0

We can safely use an unbounded `@cache`, because there can only be 16 valid pairs

why not do the same for FloatType?

9b58520

test(DRAFT): Try to get more cov

433e439

test: Almost full cov

63a1764

test: And yet more coverage

f5b65ef

docs: Add todos for temporal -> numeric

b5521e4

perf: Don't lookup the type you have already!

0074caa

Oops

refactor: Generalize _max_float

9488191

Make (Date, Datetime) preferable to Numeric

89043d1

dangotbanned added a commit that referenced this pull request Jan 22, 2026

docs: Remove repeat inline supertype docs

0a44dd8

Child of #3396

dangotbanned mentioned this pull request Jan 22, 2026

docs: Remove repeat inline supertype docs #3411

Merged

docs: Remove repeat inline supertype docs (#3411)

ed1b614

dangotbanned mentioned this pull request Jan 23, 2026

feat(expr-ir): Add BaseFrame.unnest #3414

Merged

refactor: Deduplicate v1 dtypes imports

d66d07f

Makes it much more visible which types are **really** versioned

FBruzzesi reviewed Jan 25, 2026

FBruzzesi and others added 11 commits January 25, 2026 18:03

merge main and solve conflicts

fdd1bc3

WIP: Add support for (Decimal, Decimal)

42caf17

{Decimal, IntegerType} case

ec0401f

simplify a bit

b5ae62a

fix typing issue + function renaming

8dbd2ef

merge main

69b07f3

add support for {List, Array} -> List

3ad1639

add support for {String, X} -> String, for X not Binary

8d9e053

update docs for {String, X} and {List, Array} cases

ef2c7db

Merge branch 'main' into dtypes/supertyping

88683e0

[pre-commit.ci] auto fixes from pre-commit.com hooks

8ecb636

for more information, see https://pre-commit.ci

FBruzzesi mentioned this pull request Jan 31, 2026

feat: Disallow casting temporal to numeric #3430

Open

10 tasks

dangotbanned added 3 commits February 3, 2026 15:27

Merge remote-tracking branch 'upstream/main' into dtypes/supertyping

eeb822d

fiddle with (Array, List) -> List

9ed0d17

- Typing needed work - Think it should have lower priority - The order of operands doesn't matter, they both have `.inner`

refactor: Reduce String branching

548e5b8

The comment can be code 😉

dangotbanned added a commit that referenced this pull request Feb 3, 2026

revert: Keep only what is required

14f81fc

See #3396 (comment)

dangotbanned added a commit that referenced this pull request Feb 3, 2026

refactor: Replace (polars) native_to_narwhals_dtype

425162e

> and it serves a single purpose in the codebase #3396 (comment)

dangotbanned commented Feb 6, 2026

dangotbanned added a commit that referenced this pull request Feb 7, 2026

feat: Add concat, remove HConcat.strict

63076c6

Related #3386, #3396, #3398

dangotbanned added a commit that referenced this pull request Feb 14, 2026

feat: Partial impl Resolver.join

c980246

As much as is possible without #3396

dangotbanned added a commit that referenced this pull request Feb 16, 2026

feat(DRAFT): Add some easy Function._resolve_dtypes

fd494df

Need to decide how many of the others to leave as todos Main theme is needing `get_supertype` (#3396)

Merge branch 'main' into dtypes/supertyping

bb2846f

dangotbanned added a commit that referenced this pull request Feb 17, 2026

docs: Make _resolve_dtype gaps more visible

a86e4ee

Everything left requires `get_supertype` (#3396)

dangotbanned wants to merge 126 commits into main from dtypes/supertyping