[SUGGESTION] Avoid correction of barcode names #288
Comments
I also found a solution to remove the '-1' from the fragments file; however, the fastest I managed was 1 min/file (for average runs with ~5K cells). This is also a bit risky when there is more than one GEM well.
Seems that this is coded here:
and three entries in the R version: vsn-pipelines/src/utils/bin/sc_file_converter.R Lines 123 to 127 in 91e5724
Yes, I added this so that it's easier to identify the cells w/o having to mask them first.
That would be great (or at least giving it as an option)! I found solutions to work with the fragments file without it, but they slow things down significantly: while it is true that we normally work with single GEM wells ('-1'), I can't assume it will always be like this. Keeping the index would make it very straightforward :) I guess this could also be problematic if you have aggregated runs in the 10x scRNA-seq results, where removing the '-[0-9]' can result in repeated barcodes. I have some data to test this.
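To illustrate the concern about aggregated runs, here is a minimal Python sketch. It is not code from vsn-pipelines: the function names and the example barcodes are hypothetical, and it assumes the GEM well index is a trailing '-[0-9]+' suffix, as described in this issue.

```python
import re
from collections import Counter


def strip_gem_suffix(barcode):
    """Drop a trailing '-<digits>' GEM well index, if present (illustrative only)."""
    return re.sub(r"-[0-9]+$", "", barcode)


def duplicated_after_stripping(barcodes):
    """Return the stripped barcodes that occur more than once."""
    counts = Counter(strip_gem_suffix(b) for b in barcodes)
    return sorted(b for b, n in counts.items() if n > 1)


# Hypothetical barcodes from an aggregated run with two GEM wells.
aggregated = ["ATGCTGCTCTA-1", "ATGCTGCTCTA-2", "GGCATTACGGA-1"]
dups = duplicated_after_stripping(aggregated)
print(dups)
```

With these inputs, the same nucleotide sequence appears in two GEM wells, so it collapses to a single name once the suffix is removed.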
@cbravo93 yes, indeed, that would be better and more robust for later. Let's append the sample name to the complete cell barcode.
@cbravo93 this is fixed in
Is your feature request related to a problem? Please describe.
In 10X data, barcode names generally have '-[0-9]' at the end (e.g. ATGCTGCTCTA-1). I noticed that the number is removed in the pipeline, resulting in barcode-sample_id (e.g. ATGCTGCTCTA-Sample_1). However, for downstream analyses, and eventually for working with fragments files for the multiome, keeping the initial number is very relevant.
Describe the solution you'd like
Would it be possible to return the cell names as barcode-number-sample_id? E.g. ATGCTGCTCTA-1-Sample_1
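As a rough sketch of the requested naming scheme (not the pipeline's actual implementation), the helpers below append the sample ID to the complete barcode and split it back apart. The names `tag_barcode` and `split_cell_name` are hypothetical, and the sketch assumes every input barcode already carries its '-[0-9]' suffix.

```python
def tag_barcode(barcode, sample_id):
    """Append the sample ID to the complete barcode, keeping the GEM well index."""
    return f"{barcode}-{sample_id}"


def split_cell_name(cell_name):
    """Split 'barcode-number-sample_id' into its three parts.

    Only the first two hyphens are treated as separators, so a sample ID
    that itself contains '-' is returned intact.
    """
    sequence, gem_index, sample_id = cell_name.split("-", 2)
    return sequence, gem_index, sample_id


name = tag_barcode("ATGCTGCTCTA-1", "Sample_1")
print(name)
print(split_cell_name(name))
```

Because the original barcode is left untouched, the first two fields can be rejoined to match the barcode column of a fragments file without any rewriting of that file.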