Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Drop unused categories in ExperimentAxisQuery.to_anndata #204

Merged
merged 5 commits into from
Jul 24, 2024

Conversation

nguyenv
Copy link
Member

@nguyenv nguyenv commented Jul 24, 2024

Issue:
single-cell-data/TileDB-SOMA#2765

Changes:
ExperimentAxisQuery.to_anndata iterates through each column and drops unused categories.

Notes for Reviewer:
This change was originally proposed in single-cell-data/TileDB-SOMA#2811 but was suggested somacore was a more appropriate location for it.

Copy link
Member

@johnkerl johnkerl left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From the discussion here

single-cell-data/TileDB-SOMA#2765 (comment)

we need a new flag on to_anndata:

  • optional
  • default if omitted: current behavior
  • if supplied and True: do what this PR does
  • name: droplevels? Or something else? Let's chat with @pablo-gar @mojaveazure @eddelbuettel

@nguyenv
Copy link
Member Author

nguyenv commented Jul 24, 2024

As placeholder I have chosen drop_levels but will change to the agreed upon name.

@@ -298,7 +299,7 @@ def to_anndata(

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's add a description of what drop_levels does ...

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@pablo-gar @eddelbuettel @mojaveazure are you okay with drop_levels?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(Presuming we'll want a same-ish name in Python & R both ...)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seurat uses drop to match R, but I'm fine with drop_levels in SOMA

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds good to me!

Copy link
Member

@johnkerl johnkerl left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🚢 with a suggested docstring update

python-spec/src/somacore/query/query.py Outdated Show resolved Hide resolved
@nguyenv nguyenv merged commit 5c0c30d into main Jul 24, 2024
6 of 7 checks passed
@nguyenv nguyenv deleted the viviannguyen/drop-unused-cate branch July 24, 2024 18:39
mojaveazure added a commit to single-cell-data/TileDB-SOMA that referenced this pull request Aug 2, 2024
…stors

R analog of #2811 and single-cell-data/SOMA#204; add a `drop_levels`
paramter to the ecosystem outgestors to drop unused factor levels from
resulting data frames

Modified SOMA methods:
 - `SOMAExperimentAxisQuery$to_seurat()`: add `drop_levels` to drop
   drop unused levels from `obs` and `var` data frames
 - `SOMAExperimentAxisQuery$to_seurat_assay()`: add `drop_levels` to
   drop unused levels from `var` data frame
 - `SOMAExperimentAxisQuery$to_single_cell_experiment()`: add
   `drop_levels` to drop unused levels from `obs` and `var` data frames

Also shifts `SOMAExperimentAxisQuery$to_seurat()` and
`SOMAExperimentAxisQuery$to_seurat_assay()` to use
`SOMAExperimentAxisQuery$private$.load_df()` for loading `obs` and
`var`; removing standalone code and increase sharing with the SCE
outgestor

resolves #2765

[SC-51945](https://app.shortcut.com/tiledb-inc/story/51945)
mojaveazure added a commit to single-cell-data/TileDB-SOMA that referenced this pull request Aug 2, 2024
…stors (#2825)

* [r] Add `drop_levels` to `SOMAExperimentAxisQuery` -> ecosystem outgestors
R analog of #2811 and single-cell-data/SOMA#204; add a `drop_levels`
paramter to the ecosystem outgestors to drop unused factor levels from
resulting data frames

Modified SOMA methods:
 - `SOMAExperimentAxisQuery$to_seurat()`: add `drop_levels` to drop
   drop unused levels from `obs` and `var` data frames
 - `SOMAExperimentAxisQuery$to_seurat_assay()`: add `drop_levels` to
   drop unused levels from `var` data frame
 - `SOMAExperimentAxisQuery$to_single_cell_experiment()`: add
   `drop_levels` to drop unused levels from `obs` and `var` data frames

Also shifts `SOMAExperimentAxisQuery$to_seurat()` and
`SOMAExperimentAxisQuery$to_seurat_assay()` to use
`SOMAExperimentAxisQuery$private$.load_df()` for loading `obs` and
`var`; removing standalone code and increase sharing with the SCE
outgestor

resolves #2765

[SC-51945](https://app.shortcut.com/tiledb-inc/story/51945)

* Update changelog
Bump develop version
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants