Rework and enhance type hierarchy and generics #115

thetorpedodog · 2023-02-03T19:08:02Z

Context: single-cell-data/TileDB-SOMA#638 and single-cell-data/TileDB-SOMA#540

- Pulls basic collection impl into `BaseCollection`, allowing `Collection` to add the semantics of "no semantics". This mirrors the implementation in tiledbsoma.
- Makes `Measurement` and `Experiment` inherit from `BaseCollection`. Previous problems with this were due to missing `__slots__`.
- Adds generic parameters to `Measurement` and `Experiment` that allow implementations to specify the exact types they provide, saving a lot of `cast`ing down the road.
- Renames lots of TypeVars to be clearer about their purpose.
- Adds overloads to `add_new_collection` for better type inference.
- Tightens `_Experimentish` to only expect read accessors, not writers.
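
As a rough illustration of the `add_new_collection` overloads, here is a minimal sketch. The class bodies, the `cls` parameter name, and the default of creating a plain `Collection` are assumptions made for this example, not the actual somacore signatures.

```python
from typing import Dict, Generic, Type, TypeVar, overload

_Elem = TypeVar("_Elem")
_Coll = TypeVar("_Coll", bound="BaseCollection")


class BaseCollection(Generic[_Elem]):
    """Hypothetical stand-in for a shared collection base class."""

    # Declaring __slots__ at every level keeps instances free of __dict__.
    __slots__ = ("_contents",)

    def __init__(self) -> None:
        self._contents: Dict[str, _Elem] = {}

    # With no class given, a type checker sees a plain Collection.
    @overload
    def add_new_collection(self, key: str, cls: None = None) -> "Collection": ...

    # With a class given, a type checker sees exactly that class.
    @overload
    def add_new_collection(self, key: str, cls: Type[_Coll]) -> _Coll: ...

    def add_new_collection(self, key, cls=None):
        # Single runtime implementation shared by both overloads.
        child = (cls or Collection)()
        self._contents[key] = child
        return child


class Collection(BaseCollection[_Elem]):
    """Adds nothing: the collection with "no semantics"."""

    __slots__ = ()


class Measurement(BaseCollection[object]):
    """Hypothetical specialized collection."""

    __slots__ = ()


root: Collection = Collection()
plain = root.add_new_collection("plain")              # inferred: Collection
meas = root.add_new_collection("meas", Measurement)   # inferred: Measurement
```

The overloads only affect what a static checker infers; both calls go through the same undecorated implementation at runtime.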

While these changes add a bunch of generic slots to the base collection types, experiments, and measurements, the experience from the perspective of a SOMA library user will be roughly the same. That is to say, it's a little scary here, but the end user will still see `theimpl.Collection[ElementType]`. Type inference when using composed objects is better as well:

some_exp = theimpl.Experiment.open(...)

obs = some_exp.obs
reveal_type(obs)
# BEFORE: somacore.DataFrame
#         (i.e., the type system doesn't know what implementation
#         of the abstract DataFrame this is; it only knows about
#         the bare minimum DataFrame properties)
# AFTER:  theimpl.DataFrame

ms = some_exp.ms
reveal_type(ms)
# BEFORE: somacore.Collection[somacore.Measurement]
# AFTER:  theimpl.Collection[theimpl.Measurement]

some_meas = ms["whatever"]
reveal_type(some_meas)
# BEFORE: somacore.Measurement
# AFTER:  theimpl.Measurement

reveal_type(some_meas.X)
# BEFORE: somacore.Collection[somacore.NDArray]
# AFTER:  theimpl.Collection[theimpl.NDArray]
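
The mechanism behind that improved inference can be sketched roughly as follows. Every class here is a simplified, hypothetical stand-in (the `Impl*` names play the role of `theimpl`), and the constructors are invented for the example; the real somacore and tiledbsoma classes are more involved.

```python
from typing import Dict, Generic, TypeVar


class DataFrame:
    """Stand-in for the abstract data frame type."""

    __slots__ = ()


class Measurement:
    """Stand-in for the abstract measurement type."""

    __slots__ = ()


_Elem = TypeVar("_Elem")
_DF = TypeVar("_DF", bound=DataFrame)
_Meas = TypeVar("_Meas", bound=Measurement)


class Collection(Generic[_Elem]):
    __slots__ = ("_items",)

    def __init__(self, items: Dict[str, _Elem]) -> None:
        self._items = items

    def __getitem__(self, key: str) -> _Elem:
        return self._items[key]


class Experiment(Generic[_DF, _Meas]):
    """Generic over the concrete types an implementation provides."""

    __slots__ = ("obs", "ms")

    def __init__(self, obs: _DF, ms: Collection[_Meas]) -> None:
        self.obs = obs
        self.ms = ms


# An implementation pins the type parameters once, in its class definition.
class ImplDataFrame(DataFrame):
    __slots__ = ()


class ImplMeasurement(Measurement):
    __slots__ = ()


class ImplExperiment(Experiment[ImplDataFrame, ImplMeasurement]):
    __slots__ = ()


exp = ImplExperiment(
    ImplDataFrame(), Collection({"whatever": ImplMeasurement()})
)
obs = exp.obs                  # a checker should infer ImplDataFrame
some_meas = exp.ms["whatever"]  # a checker should infer ImplMeasurement
```

Because the type parameters are fixed in `ImplExperiment`'s base class list, user code never has to spell them out.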

There is no change at runtime; the actual types of the objects remain the same, but autocompletion, type checking, and other tooling have a much better idea of what is going on.

On the tiledbsoma side, the corresponding diff is pretty small. The key part is in io.py, where the `cast(tiledbsoma.Measurement, ms[whatever])` no longer needs to happen, since the type system already knows it’s a `tiledbsoma.Measurement`. While that is the only change there specifically, there will be corresponding improvements in user code.
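
To make the `cast` point concrete, here is a small sketch using plain dictionaries in place of SOMA collections. The classes and the `impl_only` method are hypothetical and exist only for illustration; this is not the code in io.py.

```python
from typing import Dict, cast


class Measurement:
    """Stand-in for the abstract measurement type."""


class ImplMeasurement(Measurement):
    """Stand-in for an implementation's measurement type."""

    def impl_only(self) -> str:
        return "impl"


# Before: the container is typed in terms of the abstract element, so
# reaching an implementation-specific method needs a cast.
before: Dict[str, Measurement] = {"whatever": ImplMeasurement()}
m1 = cast(ImplMeasurement, before["whatever"])

# After: the container carries the concrete element type, so no cast.
after: Dict[str, ImplMeasurement] = {"whatever": ImplMeasurement()}
m2 = after["whatever"]

print(m1.impl_only(), m2.impl_only())
```

`typing.cast` returns its argument unchanged at runtime, so dropping it changes only what the type checker has to be told.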

And just to reiterate: runtime behavior is identical, and any code which works now will continue to work, but static type inference is significantly improved.

thetorpedodog requested review from mlin, johnkerl and atolopko-czi February 3, 2023 19:08

johnkerl changed the title ~~Rework and enhance type hierarchy and generics.~~ Rework and enhance type hierarchy and generics Feb 3, 2023

johnkerl approved these changes Feb 3, 2023

thetorpedodog merged commit 4aab19f into main Feb 3, 2023

thetorpedodog deleted the more-better-types branch February 3, 2023 20:30
