Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[python] Add missing lifecycle methods and expand docstrings #119

Merged
merged 6 commits into from
Feb 13, 2023

Conversation

thetorpedodog
Copy link
Contributor

@thetorpedodog thetorpedodog commented Feb 9, 2023

  • Adds mode, close, and closed methods/properties.
  • Fills in most of the docstrings with details about parameters and expected behavior.

This change depends upon the result of single-cell-data/TileDB-SOMA#910.

Part of #107 and single-cell-data/TileDB-SOMA#638.

- Adds `mode`, `close`, and `closed` methods/properties.
- Fills in most of the docstrings with details about parameters and
  expected behavior.
@johnkerl johnkerl changed the title Add missing lifecycle methods and expand docstrings. [python] Add missing lifecycle methods and expand docstrings Feb 10, 2023
python-spec/src/somacore/data.py Outdated Show resolved Hide resolved
python-spec/src/somacore/data.py Outdated Show resolved Hide resolved
python-spec/src/somacore/data.py Outdated Show resolved Hide resolved
indexed dimensions.
- Specifying ``None`` or an empty sequence (e.g. ``()``) represents
no constraints over any dimension, returning the entire dataset.
TODO: https://github.com/single-cell-data/TileDB-SOMA/pull/910
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this issue is now closed -- ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It was not as of the time of writing but is now completed.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no argument; i get the idea we're all doing parallel development

as long as the TODO is gone that's all i ask

python-spec/src/somacore/data.py Outdated Show resolved Hide resolved

Each dimension may be indexed by value or slice:

- Slices are doubly-inclusive.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

doubly-inclusive -> doubly inclusive

- Slices are doubly-inclusive.
- Half-specified slices include all data up to or starting from the
specified bound, inclusive.
- Negative indexing is unsupported.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

correct

python-spec/src/somacore/data.py Outdated Show resolved Hide resolved
python-spec/src/somacore/data.py Outdated Show resolved Hide resolved
python-spec/src/somacore/data.py Outdated Show resolved Hide resolved
python-spec/src/somacore/data.py Outdated Show resolved Hide resolved
python-spec/src/somacore/data.py Outdated Show resolved Hide resolved
@property
@abc.abstractmethod
def mode(self) -> options.OpenMode:
"""Returns the mode this object was opened in."""
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do you want to expand to match the docstring in tiledbsoma, which inclues "... either 'r' or 'w'."

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added.

@@ -46,15 +46,26 @@ def create(
platform_config: Optional[options.PlatformConfig] = None,
context: Optional[Any] = None,
) -> Self:
"""Creates a new DataFrame."""
"""Creates a new ``DataFrame`` at the given URI.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do we want to include info on the requirements & defaulting around soma_joinid?

i.e.,

  • soma_joinid must have type pyarrow.int64
  • if soma_joinid is omitted from the user-provided schema, it will be added
  • it may be an indexed column

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added this information.

- Coordinates can be specified as a scalar value, a Python sequence
(``list``, ``tuple``, etc.), a NumPy ndarray, an Arrow array, or
similar objects (as defined by ``SparseDFCoords``).
- Slices are doubly inclusive: ``slice(2, 4)`` means ``[2, 3, 4]``,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

doubly-inclusive is a term that I do not often see in Python docs - more common is half-open or closed, to track with traditional interval/range nomenclature.

Suggested change for clarity: "Slices are doubly inclusive (i.e., specify a closed range): ..."

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ditto for other places we use the term.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changed to "closed" in all locations.

index columns (e.g., ``['cell_type', 'tissue_type']``).
All named columns must exist in the schema, and at least one
index column name is required.
"""
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

quite a few methods are missing lifecycle tags

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

added to all methods

Copy link
Member

@bkmartinjr bkmartinjr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor comments for sake of improving clarity. No objections to content. Thanks for doing this!

@thetorpedodog thetorpedodog merged commit 8922de4 into main Feb 13, 2023
@thetorpedodog thetorpedodog deleted the finishing-docs branch February 13, 2023 16:30
thetorpedodog added a commit to single-cell-data/TileDB-SOMA that referenced this pull request Feb 15, 2023
- Removes the "slice indices must be positive" requirement,
  per discussion on single-cell-data/SOMA#119.
- Enables slicing for string dimensions.
- Makes over-indexing too low work the same as too high.

Instead of using the non-empty domain for constraining slices, this uses
the full domain.  This saves the call to get the non-empty domain and
works identically.

This change also follows up on the previous change to allow
"over-indexing" off the low end of a domain to work just as it does
when indexing off the high end of a domain: the low end is clipped to
the bottom of the domain. So for a dimension with domain [–3, 8]:

    slice(-100, 100) -> -3, 8
    slice(-5, 0) -> -3, 0
    slice(5, 10) -> 5, 8
    slice(-100, -50) -> error (too low)
    slice(50, 100) -> error (too high)

Since this uses the New And Improved `is_slice_of` function, this also
pulls in the new somacore (where that is the only change present).
thetorpedodog added a commit to single-cell-data/TileDB-SOMA that referenced this pull request Feb 15, 2023
- Removes the "slice indices must be positive" requirement,
  per discussion on single-cell-data/SOMA#119.
- Enables slicing for string dimensions.

This change also follows up on the previous change to allow
"over-indexing" off the low end of a domain to work just as it does
when indexing off the high end of a domain: the low end is clipped to
the bottom of the domain. So for a dimension with domain [–3, 8]:

    slice(-100, 100) -> -3, 8
    slice(-5, 0) -> -3, 0
    slice(5, 10) -> 5, 8
    slice(-100, -50) -> error (too low)
    slice(50, 100) -> error (too high)

Since this uses the New And Improved `is_slice_of` function, this also
pulls in the new somacore (where that is the only change present).
thetorpedodog added a commit to single-cell-data/TileDB-SOMA that referenced this pull request Feb 15, 2023
- Removes the "slice indices must be positive" requirement,
  per discussion on single-cell-data/SOMA#119.
- Enables slicing for string dimensions.

This change also follows up on the previous change to allow
"over-indexing" off the low end of a domain to work just as it does
when indexing off the high end of a domain: the low end is clipped to
the bottom of the domain. So for a dimension with domain [−3, 8]:

    slice(-100, 100) -> -3, 8
    slice(-5, 0) -> -3, 0
    slice(5, 10) -> 5, 8
    slice(-100, -50) -> error (too low)
    slice(50, 100) -> error (too high)

Since this uses the New And Improved `is_slice_of` function, this also
pulls in the new somacore (where that is the only change present).
thetorpedodog added a commit to single-cell-data/TileDB-SOMA that referenced this pull request Feb 15, 2023
- Removes the "slice indices must be positive" requirement,
  per discussion on single-cell-data/SOMA#119.
- Enables slicing for string dimensions.

This change also follows up on the previous change to allow
"over-indexing" off the low end of a domain to work just as it does
when indexing off the high end of a domain: the low end is clipped to
the bottom of the domain. So for a dimension with domain [−3, 8]:

    slice(-100, 100) -> -3, 8
    slice(-5, 0) -> -3, 0
    slice(5, 10) -> 5, 8
    slice(-100, -50) -> error (too low)
    slice(50, 100) -> error (too high)

Since this uses the New And Improved `is_slice_of` function, this also
pulls in the new somacore (where that is the only change present).
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants