revise unit tests `tidySingleCellExperiment` #79

HelenaLC · 2023-08-05T19:12:02Z

revised tests (including plotting) bringing us to ~90% coverage
tests are a little more complex/data-driven now (e.g., no hard coded numbers, in case we change the example data)
also fixed a potential bug in rename() (renaming special columns, e.g., .cell, previously gave an error that the column doesn't exists, but should work as intended now)
also updated the example data (pbmc_small) via BiocGenerics::updateObject()) as it was outdated, causing some tests to fail (due to differences in internal metadata)

stemangiola

Also, sorry to be a pain. Since we are changing the test already, could I please ask to use base-R pipe?

I am trying to get rid of dplyr pipe wherever I can.

stemangiola · 2023-08-06T07:17:32Z

R/dplyr_methods.R

@@ -275,12 +274,13 @@ mutate.SingleCellExperiment <- function(.data, ...) {
 rename.SingleCellExperiment <- function(.data, ...) {

    # Check that we are not modifying a key column
-    cols <- tidyselect::eval_select(expr(c(...)), colData(.data) %>% as.data.frame())
-
+    df <- as_tibble(.data)


This would be, I believe more inefficient than getting colData directly, as it integrates also PC, UMAP etc..

maybe could you help me understand the nature of this 3-line change? So I can have a bit more context.

Yes, so I came across this while writing unit tests. Basically, if you try with the current version something like rename(sce, new_name=.cell), it will pass without an informative warning, but then fails because .cell is not an actual column.

I think it works for other functions, but not here, because the syntax is rename(..., new=old) (unlike most other verbs, e.g., mutate(..., new=value), we want to block the OLD columns here, I think?), so cols from using eval_select() will correspond to the left-hand side names, not the ones being changed (right-hand side).

What made it clear to me was adding a print(cols) in line 279 and see what's happening vs. what happens with using eval_rename() instead (which returns indices)... Of course, would be great to find a way without using as_tibble(), but I couldn't come up with one :/ Though if we want to limit cost here, we could just do, e.g., as_tibble(.data[, 1]).

Actually, I am not sure if maybe you head in mind doing both, i.e., not letting someone rename a column to .cell, but also not trying to rename .cell? I that case, we'd need a combination of both eval_select() and something else like the above, but more elegantly perhaps.

I see.

So .cell is an immutable column.

A solution can be

colData() |> as_tibble(rownames = EXISTING_INTERNAL_FUNCTION_THAT_ENCODES_CELL_COLNAME)

rename

drop .cell column so it's untouched

warning ("tidySingleCellExperiment says: ...") If .cell is in the query (there could be multiple columns in the query)

Actually, I am not sure if maybe you head in mind doing both, i.e., not letting someone rename a column to .cell, but also not trying to rename .cell?

Pretty much this.

stemangiola · 2023-08-06T07:24:26Z

revised tests (including plotting) bringing us to ~90% coverage

tests are a little more complex/data-driven now (e.g., no hard coded numbers, in case we change the example data)

also fixed a potential bug in rename() (renaming special columns, e.g., .cell, previously gave an error that the column doesn't exists, but should work as intended now)

also updated the example data (pbmc_small) via BiocGenerics::updateObject()) as it was outdated, causing some tests to fail (due to differences in internal metadata)

Thanks @HelenaLC ! That's great. I left 3 comments with some little changes or explanations.

Love the PR.

HelenaLC · 2023-08-06T08:36:28Z

Also, sorry to be a pain. Since we are changing the test already, could I please ask to use base-R pipe?

I am trying to get rid of dplyr pipe wherever I can.

Yes, absolutely! I actually prefer not using any pipes internally, but didn't want to fiddle too much with the style. But agreed, I'll use base R pipes instead of %>%.

(Side note: perhaps we could have a brief chat re linting. I am mainly focusing on Bioc-guidelines plus some simplifications, and I am leaning towards more base R vs. tidy-style, which I think would make things cleaner (and briefer) in many (not all) cases, at least internally...)

… tibble

stemangiola · 2023-08-06T10:51:14Z

Also, sorry to be a pain. Since we are changing the test already, could I please ask to use base-R pipe?
I am trying to get rid of dplyr pipe wherever I can.

Yes, absolutely! I actually prefer not using any pipes internally, but didn't want to fiddle too much with the style. But agreed, I'll use base R pipes instead of %>%.

Base pipes after R evaluation they disappear. It is just a visual-coding aid. So makes no difference for R.

(Side note: perhaps we could have a brief chat re linting. I am mainly focusing on Bioc-guidelines plus some simplifications, and I am leaning towards more base R vs. tidy-style, which I think would make things cleaner (and briefer) in many (not all) cases, at least internally...)

Yes thta's a more complex topic. We could discuss about that. It is good to respect standards, but we are practically contributing to changing them. And it is becoming evident to many. When I think about standards, I try to imagine what they will be in 5-10 years. Things are moving fast :)

Personal opinion: I think pipes make the code more readable in virtually all cases.

HelenaLC · 2023-08-06T12:13:42Z

Also, sorry to be a pain. Since we are changing the test already, could I please ask to use base-R pipe?
I am trying to get rid of dplyr pipe wherever I can.

Yes, absolutely! I actually prefer not using any pipes internally, but didn't want to fiddle too much with the style. But agreed, I'll use base R pipes instead of %>%.

Base pipes after R evaluation they disappear. It is just a visual-coding aid. So makes no difference for R.

(Side note: perhaps we could have a brief chat re linting. I am mainly focusing on Bioc-guidelines plus some simplifications, and I am leaning towards more base R vs. tidy-style, which I think would make things cleaner (and briefer) in many (not all) cases, at least internally...)

Yes thta's a more complex topic. We could discuss about that. It is good to respect standards, but we are practically contributing to changing them. And it is becoming evident to many. When I think about standards, I try to imagine what they will be in 5-10 years. Things are moving fast :)

Personal opinion: I think pipes make the code more readable in virtually all cases.

Absolutely. I did not mean to speak against pipes in general (illusional or not). Currently, I am trying my best to keep them for workflow-like operations, where they do make things easier to write and read. And only not use them when it really is a lot easier to follow (in my opinion), e.g., I struggled more to understand things like ... |> colnames() |> duplicated() |> which() |> length() |> gt(0) or ... |> duplicated() |> any() |> '!'() as opposed to !any(duplicated(colnames(...))), which literally reads "not any duplicated colnames". All that to say: Yes, I agree that they are easier to read in many cases, but perhaps not all (and also more cumbersome to debug at times...).

Independent of personal opinions / style-preferences, my case against using dplyr/tidyr inside packages (unless it does make things a LOT easier) is mainly because they are quite volatile at times. I have had to make quite frequent packages updates because of some syntax changes, deprecations etc. throwing warnings. So find it best to avoid them for simple things when possible without much magic.

Aaanyways, hoping to finish a first iteration first (towards a clean-ish BiocCheck) and we can see from there and change things that are not appreciated. Like I said, I am not against pipes (use them a lot!) and am keeping them in the vast majoirty of cases (like ~90%). Mainly trying to assure consistent use of <-/=, spacing, line-width, not using @ for accession, etc.

*One example re the above, when() has been deprecated in favor of base R if (...) ... else(...) ... :/

stemangiola · 2023-08-06T12:19:21Z

is mainly because they are quite volatile at times. I have had to make quite frequent packages updates because of some syntax changes, deprecations etc

true

HelenaLC · 2023-08-14T09:27:30Z

Just wondering before I finalize the next one, what's happening with this PR? Is there something you'd like to be changed explicitly (and perhaps kept for another PR) before merging?

stemangiola · 2023-08-14T10:03:46Z

I see.

So .cell is an immutable column.

A solution can be

colData() |> as_tibble(rownames = EXISTING_INTERNAL_FUNCTION_THAT_ENCODES_CELL_COLNAME)

rename

drop .cell column so it's untouched

warning ("tidySingleCellExperiment says: ...") If .cell is in the query (there could be multiple columns in the query)

Hello @HelenaLC, is the above still pending? I think this is the last part missing, to be modified fro your PR. All the rest look fantastic.

HelenaLC · 2023-08-14T12:02:40Z

Alright. I implemented a fix that I think does what it should, i.e., special columns (.cell, PCs etc.) cannot be renamed NOR be used as target names; it is a little more complicated than what was there before, but does the trick.

stemangiola · 2023-08-15T04:31:48Z

Alright. I implemented a fix that I think does what it should, i.e., special columns (.cell, PCs etc.) cannot be renamed NOR be used as target names; it is a little more complicated than what was there before, but does the trick.

Great, thanks for that! We were not getting the expected error, so I slightly changed the logic.

Also, in the future, we can check expect_error specifying the expected error message.

Thanks for this PR!

revise unit tests

78abc43

stemangiola self-requested a review August 6, 2023 07:22

stemangiola reviewed Aug 6, 2023

View reviewed changes

stemangiola added bug Something isn't working enhancement New feature or request labels Aug 6, 2023

stemangiola assigned HelenaLC Aug 6, 2023

use base R pipe; bug fix: +make.names() when extracting features from…

07b8741

… tibble

stemangiola linked an issue Aug 9, 2023 that may be closed by this pull request

Improve unit testing of tidySingleCellExperiment tidyomics/tidyomics#5

Closed

bug fix; rename() blocks special columns for both to & from

de8dbb9

stemangiola added 4 commits August 15, 2023 14:11

update logics as the error we were getting was unexpected

61570ad

fix mutate error message

a202e5f

fix check

6196313

test specify error messages

02dcbf3

stemangiola merged commit d05b3fd into stemangiola:master Aug 15, 2023

stemangiola changed the title ~~revise unit tests~~ revise unit tests tidySingleCellExperiment Aug 17, 2023

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

revise unit tests `tidySingleCellExperiment` #79

revise unit tests `tidySingleCellExperiment` #79

HelenaLC commented Aug 5, 2023 •

edited

Loading

stemangiola left a comment

stemangiola Aug 6, 2023

stemangiola Aug 6, 2023

HelenaLC Aug 6, 2023 •

edited

Loading

HelenaLC Aug 6, 2023

stemangiola Aug 6, 2023 •

edited

Loading

stemangiola Aug 9, 2023

stemangiola commented Aug 6, 2023

HelenaLC commented Aug 6, 2023 •

edited

Loading

stemangiola commented Aug 6, 2023 •

edited

Loading

HelenaLC commented Aug 6, 2023 •

edited

Loading

stemangiola commented Aug 6, 2023

HelenaLC commented Aug 14, 2023

stemangiola commented Aug 14, 2023

HelenaLC commented Aug 14, 2023

stemangiola commented Aug 15, 2023

revise unit tests tidySingleCellExperiment #79

revise unit tests tidySingleCellExperiment #79

Conversation

HelenaLC commented Aug 5, 2023 • edited Loading

stemangiola left a comment

Choose a reason for hiding this comment

stemangiola Aug 6, 2023

Choose a reason for hiding this comment

stemangiola Aug 6, 2023

Choose a reason for hiding this comment

HelenaLC Aug 6, 2023 • edited Loading

Choose a reason for hiding this comment

HelenaLC Aug 6, 2023

Choose a reason for hiding this comment

stemangiola Aug 6, 2023 • edited Loading

Choose a reason for hiding this comment

stemangiola Aug 9, 2023

Choose a reason for hiding this comment

stemangiola commented Aug 6, 2023

HelenaLC commented Aug 6, 2023 • edited Loading

stemangiola commented Aug 6, 2023 • edited Loading

HelenaLC commented Aug 6, 2023 • edited Loading

stemangiola commented Aug 6, 2023

HelenaLC commented Aug 14, 2023

stemangiola commented Aug 14, 2023

HelenaLC commented Aug 14, 2023

stemangiola commented Aug 15, 2023

revise unit tests `tidySingleCellExperiment` #79

revise unit tests `tidySingleCellExperiment` #79

HelenaLC commented Aug 5, 2023 •

edited

Loading

HelenaLC Aug 6, 2023 •

edited

Loading

stemangiola Aug 6, 2023 •

edited

Loading

HelenaLC commented Aug 6, 2023 •

edited

Loading

stemangiola commented Aug 6, 2023 •

edited

Loading

HelenaLC commented Aug 6, 2023 •

edited

Loading