-
Notifications
You must be signed in to change notification settings - Fork 10
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[python] Add missing lifecycle methods and expand docstrings #119
Conversation
- Adds `mode`, `close`, and `closed` methods/properties. - Fills in most of the docstrings with details about parameters and expected behavior.
python-spec/src/somacore/data.py
Outdated
indexed dimensions. | ||
- Specifying ``None`` or an empty sequence (e.g. ``()``) represents | ||
no constraints over any dimension, returning the entire dataset. | ||
TODO: https://github.com/single-cell-data/TileDB-SOMA/pull/910 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this issue is now closed -- ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It was not as of the time of writing but is now completed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
no argument; i get the idea we're all doing parallel development
as long as the TODO is gone that's all i ask
python-spec/src/somacore/data.py
Outdated
|
||
Each dimension may be indexed by value or slice: | ||
|
||
- Slices are doubly-inclusive. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
doubly-inclusive -> doubly inclusive
python-spec/src/somacore/data.py
Outdated
- Slices are doubly-inclusive. | ||
- Half-specified slices include all data up to or starting from the | ||
specified bound, inclusive. | ||
- Negative indexing is unsupported. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
correct
python-spec/src/somacore/base.py
Outdated
@property | ||
@abc.abstractmethod | ||
def mode(self) -> options.OpenMode: | ||
"""Returns the mode this object was opened in.""" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
do you want to expand to match the docstring in tiledbsoma
, which inclues "... either 'r' or 'w'."
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added.
@@ -46,15 +46,26 @@ def create( | |||
platform_config: Optional[options.PlatformConfig] = None, | |||
context: Optional[Any] = None, | |||
) -> Self: | |||
"""Creates a new DataFrame.""" | |||
"""Creates a new ``DataFrame`` at the given URI. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
do we want to include info on the requirements & defaulting around soma_joinid?
i.e.,
- soma_joinid must have type pyarrow.int64
- if soma_joinid is omitted from the user-provided schema, it will be added
- it may be an indexed column
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added this information.
python-spec/src/somacore/data.py
Outdated
- Coordinates can be specified as a scalar value, a Python sequence | ||
(``list``, ``tuple``, etc.), a NumPy ndarray, an Arrow array, or | ||
similar objects (as defined by ``SparseDFCoords``). | ||
- Slices are doubly inclusive: ``slice(2, 4)`` means ``[2, 3, 4]``, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
doubly-inclusive
is a term that I do not often see in Python docs - more common is half-open
or closed
, to track with traditional interval/range nomenclature.
Suggested change for clarity: "Slices are doubly inclusive (i.e., specify a closed range): ..."
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ditto for other places we use the term.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Changed to "closed" in all locations.
index columns (e.g., ``['cell_type', 'tissue_type']``). | ||
All named columns must exist in the schema, and at least one | ||
index column name is required. | ||
""" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
quite a few methods are missing lifecycle tags
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
added to all methods
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Minor comments for sake of improving clarity. No objections to content. Thanks for doing this!
- Removes the "slice indices must be positive" requirement, per discussion on single-cell-data/SOMA#119. - Enables slicing for string dimensions. - Makes over-indexing too low work the same as too high. Instead of using the non-empty domain for constraining slices, this uses the full domain. This saves the call to get the non-empty domain and works identically. This change also follows up on the previous change to allow "over-indexing" off the low end of a domain to work just as it does when indexing off the high end of a domain: the low end is clipped to the bottom of the domain. So for a dimension with domain [–3, 8]: slice(-100, 100) -> -3, 8 slice(-5, 0) -> -3, 0 slice(5, 10) -> 5, 8 slice(-100, -50) -> error (too low) slice(50, 100) -> error (too high) Since this uses the New And Improved `is_slice_of` function, this also pulls in the new somacore (where that is the only change present).
- Removes the "slice indices must be positive" requirement, per discussion on single-cell-data/SOMA#119. - Enables slicing for string dimensions. This change also follows up on the previous change to allow "over-indexing" off the low end of a domain to work just as it does when indexing off the high end of a domain: the low end is clipped to the bottom of the domain. So for a dimension with domain [–3, 8]: slice(-100, 100) -> -3, 8 slice(-5, 0) -> -3, 0 slice(5, 10) -> 5, 8 slice(-100, -50) -> error (too low) slice(50, 100) -> error (too high) Since this uses the New And Improved `is_slice_of` function, this also pulls in the new somacore (where that is the only change present).
- Removes the "slice indices must be positive" requirement, per discussion on single-cell-data/SOMA#119. - Enables slicing for string dimensions. This change also follows up on the previous change to allow "over-indexing" off the low end of a domain to work just as it does when indexing off the high end of a domain: the low end is clipped to the bottom of the domain. So for a dimension with domain [−3, 8]: slice(-100, 100) -> -3, 8 slice(-5, 0) -> -3, 0 slice(5, 10) -> 5, 8 slice(-100, -50) -> error (too low) slice(50, 100) -> error (too high) Since this uses the New And Improved `is_slice_of` function, this also pulls in the new somacore (where that is the only change present).
- Removes the "slice indices must be positive" requirement, per discussion on single-cell-data/SOMA#119. - Enables slicing for string dimensions. This change also follows up on the previous change to allow "over-indexing" off the low end of a domain to work just as it does when indexing off the high end of a domain: the low end is clipped to the bottom of the domain. So for a dimension with domain [−3, 8]: slice(-100, 100) -> -3, 8 slice(-5, 0) -> -3, 0 slice(5, 10) -> 5, 8 slice(-100, -50) -> error (too low) slice(50, 100) -> error (too high) Since this uses the New And Improved `is_slice_of` function, this also pulls in the new somacore (where that is the only change present).
mode
,close
, andclosed
methods/properties.This change depends upon the result of single-cell-data/TileDB-SOMA#910.
Part of #107 and single-cell-data/TileDB-SOMA#638.