sci-dash incorrect information #29
Comments
Hi Yani,
Good catch! The sci-dash was indeed computing its means and sums over all 'raw' cells, including ambient RNA, instead of over the filtered cells only. I've fixed this in the latest commit and also made some other small changes to the sci-dash. Just pull the latest code, delete the sci-dash folder of your run, and start the snakemake workflow again; it should regenerate only the sci-dash. Let me know if this fixes it for you!
Best,
Job
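To illustrate why averaging over all raw cells deflates the per-cell numbers, here is a minimal Python sketch. The barcodes and counts are made up for illustration and are not taken from the pipeline or from this run.

```python
from statistics import mean

# Hypothetical per-barcode UMI counts. Most raw barcodes carry only ambient RNA.
raw_umi_counts = {
    "AAAC": 3100, "AAAG": 2900, "AAAT": 3050,    # barcodes called as cells
    "CCCA": 4, "CCCG": 7, "CCCT": 2, "GGGA": 5,  # empty / ambient barcodes
}
# Hypothetical set of barcodes that passed cell filtering.
filtered_barcodes = {"AAAC", "AAAG", "AAAT"}


def mean_per_cell(counts, barcodes=None):
    """Mean count over the given barcodes, or over all barcodes if None."""
    if barcodes is None:
        values = list(counts.values())
    else:
        values = [counts[b] for b in barcodes]
    return mean(values)


mean_all_raw = mean_per_cell(raw_umi_counts)                      # diluted by empty barcodes
mean_filtered = mean_per_cell(raw_umi_counts, filtered_barcodes)  # cells only

print(f"mean over raw barcodes:      {mean_all_raw:.1f}")
print(f"mean over filtered barcodes: {mean_filtered:.1f}")
```

The same raw counts give a much lower mean once the empty barcodes are included in the denominator, which is the direction of the discrepancy reported in this issue.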
I think I figured it out: it had to do with similar sample names and the regular expression used to retrieve the STARsolo files (ad29488). I.e., Pmor and Pmor_50percPEG were getting the wrong statistics files because of a wildcard search that did not include the species.
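The general failure mode can be sketched as follows. This is not the code from the commit: the paths, the directory layout, and the `find_summaries` helper are hypothetical, and only the two sample names come from this thread.

```python
import re

# Hypothetical file layout; the pipeline's real output paths are not shown here.
paths = [
    "results/Pmor/Solo.out/GeneFull/Summary.csv",
    "results/Pmor_50percPEG/Solo.out/GeneFull/Summary.csv",
]


def find_summaries(sample, candidates, anchored):
    """Return the candidate paths that match the sample name.

    With anchored=False the name may match anywhere in the path, so "Pmor"
    also matches "Pmor_50percPEG". With anchored=True the name must be a
    complete path component.
    """
    name = re.escape(sample)
    if anchored:
        pattern = re.compile(rf"(^|/){name}/")
    else:
        pattern = re.compile(name)
    return [p for p in candidates if pattern.search(p)]


loose = find_summaries("Pmor", paths, anchored=False)
strict = find_summaries("Pmor", paths, anchored=True)

print(len(loose), "files matched by the loose pattern")
print(len(strict), "file matched by the anchored pattern")
```

Anchoring the sample name to a full path component (or including every distinguishing wildcard, such as the species) keeps one sample from picking up the files of another sample whose name starts with the same prefix.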
Hi Job, It did fix most of the stats, except that the successful read pairs of the two samples together are still higher than the total number of input read pairs.
That indeed sounds a bit fishy. I'll try to check whether I'm counting some reads double somewhere.
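A quick consistency check for this kind of double counting could look like the sketch below. The function name and all numbers are hypothetical; the only assumption carried over from the thread is that the per-sample successful read pairs should never add up to more than the total input.

```python
def check_read_pair_totals(total_input_pairs, success_per_sample):
    """Return (is_consistent, total_success) for per-sample read-pair counts.

    The counts are consistent when the successful read pairs summed over
    all samples do not exceed the total number of input read pairs.
    """
    total_success = sum(success_per_sample.values())
    return total_success <= total_input_pairs, total_success


# Hypothetical numbers: the second case simulates reads being counted twice.
ok, total = check_read_pair_totals(1_000, {"sample_a": 400, "sample_b": 500})
bad, doubled = check_read_pair_totals(1_000, {"sample_a": 800, "sample_b": 1_000})

print(ok, total)
print(bad, doubled)
```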
No, the samples are unhashed.
Hi Job,
I'm getting some weird output in my sci-dash:
I went back to the STARsolo summary file, and the values in the sci-dash don't match what is written there. The Summary stats are much more in line with the sci-dash output of an earlier version of the pipeline. I've added the JSON and the STARsolo Summary.csv content of the same sample below.
Best,
Yani
sci-dash JSON:
"sample_succes": {
"5mm_dsDNAse": {
"n_pairs_success": 373470356,
"sequencing_saturation": 0.751197,
"estimated_cells": 5738,
"total_mapped_reads": 131048010,
"total_unique_reads": 111547480,
"total_multimapped_reads": 19500530,
"total_correct_reads_genes": 90306169,
"total_exonic_reads": 41117200,
"total_intronic_reads": 49189000,
"total_intergenic_reads": 40741810,
"total_mitochondrial_reads": 0,
"total_exonicAS_reads": 2446128,
"total_intronicAS_reads": 7610787,
"mean_reads_per_cell": 2414,
"mean_genes_per_cell": 210,
"mean_umis_per_cell": 376
STARsolo Summary:
Number of Reads,185672402
Reads With Valid Barcodes,1
Sequencing Saturation,0.751197
Q30 Bases in CB+UMI,1
Q30 Bases in RNA read,0.93507
Reads Mapped to Genome: Unique+Multiple,0.705802
Reads Mapped to Genome: Unique,0.600776
Reads Mapped to GeneFull_Ex50pAS: Unique+Multiple GeneFull_Ex50pAS,0.486374
Reads Mapped to GeneFull_Ex50pAS: Unique GeneFull_Ex50pAS,0.441959
Estimated Number of Cells,5738
Unique Reads in Cells Mapped to GeneFull_Ex50pAS,71273128
Fraction of Unique Reads in Cells,0.868553
Mean Reads per Cell,12421
Median Reads per Cell,9947
UMIs in Cells,17618322
Mean UMI per Cell,3070
Median UMI per Cell,2504
Mean GeneFull_Ex50pAS per Cell,1596
Median GeneFull_Ex50pAS per Cell,1442
Total GeneFull_Ex50pAS Detected,20151
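The mismatch can be checked with a short Python sketch that parses a few Summary.csv rows and compares them with the sci-dash JSON. The values are copied from above; the pairing of JSON keys with Summary rows is my assumption, and the per-cell means are recomputed here as in-cell totals divided by the estimated number of cells, which is consistent with the numbers in this Summary.

```python
import csv
import io

# Rows copied from the STARsolo Summary above.
summary_csv = """Number of Reads,185672402
Estimated Number of Cells,5738
Unique Reads in Cells Mapped to GeneFull_Ex50pAS,71273128
Mean Reads per Cell,12421
UMIs in Cells,17618322
Mean UMI per Cell,3070
"""

# Values copied from the sci-dash JSON above.
sci_dash = {"estimated_cells": 5738, "mean_reads_per_cell": 2414, "mean_umis_per_cell": 376}

summary = {key: float(value) for key, value in csv.reader(io.StringIO(summary_csv))}

n_cells = summary["Estimated Number of Cells"]
recomputed_reads = summary["Unique Reads in Cells Mapped to GeneFull_Ex50pAS"] / n_cells
recomputed_umis = summary["UMIs in Cells"] / n_cells

# Assumed mapping from sci-dash keys to Summary.csv rows.
key_map = {
    "estimated_cells": "Estimated Number of Cells",
    "mean_reads_per_cell": "Mean Reads per Cell",
    "mean_umis_per_cell": "Mean UMI per Cell",
}
mismatches = {
    json_key: (sci_dash[json_key], int(summary[row]))
    for json_key, row in key_map.items()
    if sci_dash[json_key] != int(summary[row])
}

print(f"recomputed mean reads per cell: {recomputed_reads:.2f}")
print(f"recomputed mean UMIs per cell:  {recomputed_umis:.2f}")
print("mismatching keys:", sorted(mismatches))
```

The recomputed means agree with the Summary.csv rows, while the cell count matches in both files and only the two per-cell means in the sci-dash differ.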
STARsolo summary of prior run (I think the switch from GeneFull to GeneFull_Ex50pAS explains the difference between versions):
Number of Reads,190841919
Reads With Valid Barcodes,1
Sequencing Saturation,0.730686
Q30 Bases in CB+UMI,1
Q30 Bases in RNA read,0.934432
Reads Mapped to Genome: Unique+Multiple,0.818547
Reads Mapped to Genome: Unique,0.676697
Reads Mapped to GeneFull: Unique+Multiple GeneFull,0.545113
Reads Mapped to GeneFull: Unique GeneFull,0.483755
Estimated Number of Cells,5758
Unique Reads in Cells Mapped to GeneFull,79965182
Fraction of Unique Reads in Cells,0.866167
Mean Reads per Cell,13887
Median Reads per Cell,11178
UMIs in Cells,21385715
Mean UMI per Cell,3714
Median UMI per Cell,3041
Mean GeneFull per Cell,1792
Median GeneFull per Cell,1629
Total GeneFull Detected,17842