comparison of DE outputs pre/post pseudobulking #8341

Closed
antoine4ucsd opened this issue Jan 19, 2024 · 4 comments

@antoine4ucsd

Hello,
I am analyzing a set of scRNA-seq samples integrated with Harmony.
My goal is to compare two groups of samples, and I am trying to figure out the best approach.
I am following this nice vignette:
https://satijalab.org/seurat/articles/de_vignette

#1 Without pseudobulking, I tried the following:

data1 <- FindMarkers(
  object = data,
  ident.1 = "grp1",
  ident.2 = "grp2",
  assay = "SCT"
)
data2 <- FindMarkers(
  object = data,
  ident.1 = "grp1",
  ident.2 = "grp2",
  assay = "RNA"
)

This results in the following plots:
[image: Rplot]

#2 The DESeq2 test does not work without pseudobulking:

data3 <- FindMarkers(
  object = data,
  ident.1 = "grp1",
  ident.2 = "grp2",
  assay = "RNA",
  test.use = "DESeq2"
)

This fails with the following error:
converting counts to integer mode
Error in estimateSizeFactorsForMatrix(counts(object), locfunc = locfunc, :
every gene contains at least one zero, cannot compute log geometric means
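
For readers hitting the same message: DESeq2's default size-factor estimation relies on the per-gene mean of log counts across columns, and a single zero makes that mean `-Inf`. With per-cell counts nearly every gene has a zero in some cell, so no gene is usable. The base-R sketch below only illustrates that arithmetic on invented toy matrices; it does not call DESeq2 or Seurat.

```r
# Toy per-cell counts: genes in rows, cells in columns (values invented).
sparse_counts <- matrix(
  c(0, 3, 1,
    2, 0, 5,
    4, 1, 0),
  nrow = 3, byrow = TRUE,
  dimnames = list(paste0("gene", 1:3), paste0("cell", 1:3))
)

# Per-gene mean of log counts; log(0) is -Inf, so one zero makes the row -Inf.
log_geo_means <- rowMeans(log(sparse_counts))
all_genes_unusable <- all(is.infinite(log_geo_means))

# Toy per-sample sums (also invented): once cells are aggregated, zeros are
# much rarer and the log means are finite.
pseudobulk_counts <- matrix(
  c(30, 45, 38,
    12, 20, 15,
     7,  9, 11),
  nrow = 3, byrow = TRUE,
  dimnames = list(paste0("gene", 1:3), paste0("sample", 1:3))
)
pseudobulk_usable <- all(is.finite(rowMeans(log(pseudobulk_counts))))
```

This is why the same test can run on aggregated data while failing on single cells.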

#3 Next, I tried again after aggregation (pseudobulking):

dataagg_SCT <- AggregateExpression(data, assays = "SCT", return.seurat = TRUE, group.by = c("pid", "status"))
data3 <- FindMarkers(
  object = dataagg_SCT,
  assay = "SCT",
  ident.1 = "grp1",
  ident.2 = "grp2",
  test.use = "DESeq2"
)

dataagg_RNA <- AggregateExpression(data, assays = "RNA", return.seurat = TRUE, group.by = c("pid", "status"))
data4 <- FindMarkers(
  object = dataagg_RNA,
  assay = "RNA",
  ident.1 = "grp1",
  ident.2 = "grp2",
  test.use = "DESeq2"
)

This results in the plots below. As you can see, the results are completely different and shifted toward grp1 (not centered on zero), which makes me think I am missing a rescaling step. The Wilcoxon test gives no meaningful results.

[image: Rplot012]
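
One way an uncentered fold-change distribution can arise is when fold changes are computed on raw aggregated counts while the two groups differ in total counts per pseudobulk sample. The sketch below is a base-R simulation under that assumption, not Seurat's code: the depths, the gene profile, and the helpers `simulate_counts`, `log2fc`, and `cpm` are all invented for illustration.

```r
set.seed(1)
n_genes <- 200

# Shared relative expression profile: no gene truly differs between groups.
profile <- runif(n_genes, min = 1, max = 100)
profile <- profile / sum(profile)

# Hypothetical library sizes: grp1 pseudobulks are deeper than grp2 ones.
depth_grp1 <- c(4e6, 5e6, 4.5e6, 5.5e6)  # 4 samples
depth_grp2 <- c(1e6, 1.2e6, 0.9e6)       # 3 samples

# One Poisson-distributed column of counts per sample.
simulate_counts <- function(depths) {
  sapply(depths, function(d) rpois(n_genes, lambda = d * profile))
}
counts_grp1 <- simulate_counts(depth_grp1)
counts_grp2 <- simulate_counts(depth_grp2)

# log2 ratio of group means, with a pseudocount to avoid log(0).
log2fc <- function(a, b, pseudocount = 1) {
  log2(rowMeans(a) + pseudocount) - log2(rowMeans(b) + pseudocount)
}

# Raw counts: every gene looks higher in grp1 purely because of depth.
raw_lfc <- log2fc(counts_grp1, counts_grp2)

# Counts per million: rescale each sample by its own total before comparing.
cpm <- function(m) t(t(m) / colSums(m)) * 1e6
norm_lfc <- log2fc(cpm(counts_grp1), cpm(counts_grp2))

median(raw_lfc)   # shifted away from zero by roughly log2 of the depth ratio
median(norm_lfc)  # close to zero once each sample is rescaled
```

If the real data show the same pattern, comparing `colSums` of the aggregated counts between grp1 and grp2 is a quick check on whether depth explains the shift.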

All suggestions to help refine these analyses are very welcome. I was curious to get DESeq2 results, but the test does not work on my data without pseudobulking.

Thank you in advance for your help.

@camara-h

camara-h commented Jan 19, 2024

Hi @antoine4ucsd, thanks for bringing that up.
I may not have much to add on the troubleshooting side, but we are running into similar issues with our datasets.

I ran comparisons on three different datasets; here are two of them:

pseudo_s.obj <- AggregateExpression(s.object, assays = "RNA", slot = "counts", return.seurat = TRUE, group.by = c("cell_type_res.1.6", "sample", "neck_region"))
pseudo_s.obj$cell_type_res.1.6.neck_region <- paste(pseudo_s.obj$cell_type_res.1.6, pseudo_s.obj$neck_region, sep = "_")
bulk.mono.de <- FindMarkers(
  object = pseudo_s.obj,
  ident.1 = "White-adipocyte-2_Deep",
  ident.2 = "White-adipocyte-2_Superficial",
  test.use = "DESeq2"
)

[image]

pseudo_s.obj <- AggregateExpression(seurat_NIC, assays = "RNA", slot = "counts", return.seurat = TRUE, group.by = c("integrated_snn_res.1.8", "orig.ident", "sample"))
pseudo_s.obj$cluster.sample <- paste(pseudo_s.obj$integrated_snn_res.1.8, pseudo_s.obj$sample, sep = "_")
bulk.mono.de <- FindMarkers(
  object = pseudo_s.obj,
  ident.1 = "g28",
  ident.2 = "g1",
  test.use = "DESeq2"
)

[image]

Hope this is helpful :)

@longmanz
Contributor

Hi @antoine4ucsd ,

I suspect this is caused by the number of pseudobulked "cells" being too small. How many "cells" are there in your "grp1" and "grp2" groups?

@antoine4ucsd
Author

You are probably right. I have 4 samples in grp1 and 3 samples in grp2. Should I avoid pseudobulking in that case?
Thank you.

@longmanz
Contributor

Hi,
We have fixed this logFC issue in v5.0.2 and subsequent versions by implementing a normalization step when calculating the logFC, even if the 'counts' slot is used for DE testing (see https://github.com/satijalab/seurat/releases for details). Thank you for reporting this.
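
To check whether a local installation is at or past the release named above, a minimal sketch is shown below. The helper `has_logfc_fix` is hypothetical and only compares version strings against 5.0.2; `requireNamespace`, `packageVersion`, and `package_version` are standard R functions.

```r
# Hypothetical helper: TRUE if a version string is at or past the release
# the maintainers name as containing the logFC fix.
has_logfc_fix <- function(version) {
  package_version(version) >= package_version("5.0.2")
}

# Report on the locally installed Seurat, if there is one.
if (requireNamespace("Seurat", quietly = TRUE)) {
  installed <- as.character(utils::packageVersion("Seurat"))
  message("Seurat ", installed, ": fix included = ", has_logfc_fix(installed))
} else {
  message("Seurat is not installed in this library")
}
```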
