Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Using SCT assays #136

Closed
mckoch234 opened this issue Jun 30, 2024 · 6 comments
Closed

Using SCT assays #136

mckoch234 opened this issue Jun 30, 2024 · 6 comments

Comments

@mckoch234
Copy link

mckoch234 commented Jun 30, 2024

I am getting this error. Error in x[[i, drop = TRUE]]:
! 'counts' not found in this Seurat object
Did you mean "nCount_SCT"?

How do I get it to recognize the counts? The seurat objects I'm working with have multiple assays but even when I change the assay it gives me that error.

@evanbiederstedt
Copy link
Collaborator

Hi @mckoch234 CC @saketkc

I'm guessing these are changes with Seurat V5: #133

I didn't change the codebase as I was hoping others would chip in.

We could probably make things V5 compatible. @saketkc or others in the Satija Lab might be able to lend a helping hand as well.

Please detail the issues you've having; an email might be necessary.

Best, Evan

@mckoch234
Copy link
Author

mckoch234 commented Jul 9, 2024 via email

@mckoch234
Copy link
Author

mckoch234 commented Jul 9, 2024 via email

@evanbiederstedt
Copy link
Collaborator

@mckoch234

Without having the data you are working with, I cannot really debug this given the information you've provided.

Here is the function:

conos/R/integrations.R

Lines 365 to 523 in 91110ba

#' Utility function to generate a pagoda2 app from a conos object
#'
#' @param conos Conos object
#' @param cdl list Optional list of raw matrices (so that gene merging doesn't have to be redone) (default=NULL)
#' @param metadata list Optional list of (named) metadata factors (default=NULL)
#' @param filename string Name of the *.bin file to seralize for the pagoda2 application if save=TRUE (default='conos_app.bin')
#' @param save boolean Save serialized *bin file specified in filename (default=TRUE)
#' @param n.cores integer Number of cores (default=1)
#' @param n.odgenes numeric Number of top overdispersed genes to use (dfault=3e3). From pagoda2::basicP2proc().
#' @param nPcs numeric Number of PCs to use (default=100). From pagoda2::basicP2proc().
#' @param k numeric Default number of neighbors to use in kNN graph (default=30). From pagoda2::basicP2proc().
#' @param perplexity numeric Perplexity to use in generating tSNE and largeVis embeddings (default=50). From pagoda2::basicP2proc().
#' @param log.scale boolean Whether to use log scale normalization (default=TRUE). From pagoda2::basicP2proc().
#' @param trim numeric Number of cells to trim in winsorization (default=10). From pagoda2::basicP2proc().
#' @param keep.genes optional set of genes to keep from being filtered out (even at low counts) (default=NULL). From pagoda2::basicP2proc().
#' @param min.cells.per.gene numeric Minimal number of cells required for gene to be kept (unless listed in keep.genes) (default=0). From pagoda2::basicP2proc().
#' @param min.transcripts.per.cell numeric Minimumal number of molecules/reads for a cell to be admitted (default=100). From pagoda2::basicP2proc().
#' @param get.largevis boolean Whether to caluclate largeVis embedding (default=TRUE). From pagoda2::basicP2proc().
#' @param get.tsne boolean Whether to calculate tSNE embedding (default=TRUE). From pagoda2::basicP2proc().
#' @param make.geneknn boolean Whether pre-calculate gene kNN (for gene search) (default=TRUE). From pagoda2::basicP2proc().
#' @param go.env GO environment for the organism of interest (default=NULL)
#' @param cell.subset string Cells to subset with the conos embedding conos$embedding. If NULL, uses all cells via rownames(conos$embedding) (default=NULL)
#' @param max.cells numeric Limit to the cells that are included in the conos. If Inf, there is no limit (default=Inf)
#' @param additional.embeddings list Additional embeddings to add to conos for the pagoda2 app (default=NULL)
#' @param test.pathway.overdispersion boolean Find all IDs using GO category against either org.Hs.eg.db ('hs') or org.Mm.eg.db ('mm') (default=FALSE
#' @param organism string Organism of interest, either 'hs' (Homo sapiens) or 'mm' (Mus musculus, i.e. mouse) (default=NULL). Only used if test.pathway.overdispersion is TRUE. If NULL and test.pathway.overdispersion=TRUE, then 'hs' is used.
#' @param return.details boolean If TRUE, return list of p2 application, pagoda2 object, list of raw matrices, and cell names. If FALSE, simply return pagoda2 app object. (default=FALSE)
#' @return pagoda2 app object
#'
#' @export
p2app4conos <- function(conos, cdl=NULL, metadata=NULL, filename='conos_app.bin',
save=TRUE, n.cores=1, n.odgenes=3e3, nPcs=100, k=30, perplexity=50, log.scale=TRUE, trim=10,
keep.genes=NULL, min.cells.per.gene=0, min.transcripts.per.cell=100, get.largevis=TRUE,
get.tsne=TRUE, make.geneknn=TRUE, go.env=NULL, cell.subset=NULL, max.cells=Inf,
additional.embeddings=NULL, test.pathway.overdispersion=FALSE, organism=NULL, return.details=FALSE) {
if (!requireNamespace("pagoda2", quietly = TRUE)) {
stop("You have to install the pagoda2 package to use p2app4conos(). Please refer to <https://github.com/kharchenkolab/pagoda2>.")
}
if (!requireNamespace("GO.db", quietly = TRUE)) {
stop("Package \"GO.db\" is needed for this function to work. Please install it from Bioconductor.", call. = FALSE)
}
if (!requireNamespace("org.Hs.eg.db", quietly = TRUE)) {
stop("Package \"org.Hs.eg.db\" is needed for this function to work. Please install it from Bioconductor.", call. = FALSE)
}
if (!requireNamespace("org.Mm.eg.db", quietly = TRUE)) {
stop("Package \"org.Mm.eg.db\" is needed for this function to work. Please install it from Bioconductor.", call. = FALSE)
}
if (!requireNamespace("AnnotationDbi", quietly = TRUE)) {
stop("Package \"AnnotationDbi\" is needed for this function to work. Please install it from Bioconductor.", call. = FALSE)
}
if (is.null(conos$embedding)){
stop("The input Conos object must contain an embedding. Please generate with the method embedGraph().")
}
if (is.null(cdl)) {
#cdl <- lapply(conos$samples,function(p) t(p$misc$rawCounts))
cdl <- lapply(rawMatricesWithCommonGenes(conos),t)
}
samf <- lapply(conos$samples,function(x) rownames(x$counts))
samf <- as.factor(setNames(rep(names(samf),unlist(lapply(samf,length))),unlist(samf)))
if (!is.null(cell.subset)) {
cell.subset <- intersect(cell.subset,rownames(conos$embedding))
} else {
cell.subset <- rownames(conos$embedding)
}
samf <- droplevels(samf[names(samf) %in% cell.subset])
cdl <- lapply(cdl, function(x) x[,colnames(x) %in% cell.subset,drop=FALSE])
# limit to the cells that are included in the conos
vc <- unlist(lapply(cdl,colnames))
if (length(vc)>max.cells) { # subsample
cat("subsampling",length(vc),"cells down to",max.cells,'...')
vc <- sample(vc,max.cells)
cat('done\n')
}
cdl <- lapply(cdl,function(d) d[,colnames(d) %in% intersect(vc,names(samf))])
cm <- do.call(cbind,cdl)
cp2 <- pagoda2::basicP2proc(cm, n.cores=n.cores, n.odgenes=n.odgenes, nPcs=nPcs, k=k, perplexity=perplexity, log.scale=log.scale, min.cells.per.gene=min.cells.per.gene, min.transcripts.per.cell=min.transcripts.per.cell, get.largevis=get.largevis, get.tsne=get.tsne, make.geneknn=make.geneknn)
if (test.pathway.overdispersion) {
if (is.null(organism) || organism =='hs') {
ids <- unlist(lapply(mget(colnames(cp2$counts),org.Hs.eg.db::org.Hs.egALIAS2EG,ifnotfound=NA),function(x) x[1]))
rids <- names(ids)
names(rids) <- ids
# list all the ids per GO category
go.env <- list2env(eapply(org.Hs.eg.db::org.Hs.egGO2ALLEGS, function(x) as.character(na.omit(rids[x]))))
} else if (organism =='mm') {
# translate gene names to ids
ids <- unlist(lapply(mget(colnames(cp2$counts),org.Mm.eg.db::org.Mm.egALIAS2EG,ifnotfound=NA),function(x) x[1]))
# reverse map
rids <- names(ids)
names(rids) <- ids
# list all the ids per GO category
go.env <- list2env(eapply(org.Mm.eg.db::org.Mm.egGO2ALLEGS, function(x) as.character(na.omit(rids[x]))))
} else {
stop("Unknown organism. Currently only 'hs' (human) and 'mm' (mouse) supported.")
}
}
if (!is.null(go.env)) {
#cp2$getHierarchicalDiffExpressionAspects(type='PCA',clusterName='community',z.threshold=3)
# here we test for pathway overdispersion, but recursive diff expression, as shown above will work just as well
cp2$testPathwayOverdispersion(go.env,verbose=TRUE,correlation.distance.threshold=0.95,recalculate.pca=FALSE,top.aspects=15)
termDescriptions <- AnnotationDbi::Term(GO.db::GOTERM[names(go.env)]) # saves a good minute or so compared to individual lookups
geneSets <- lapply(sn(names(go.env)),function(x) {
list(properties=list(locked=TRUE,genesetname=x,shortdescription=as.character(termDescriptions[x])),genes=c(go.env[[x]]))
})
} else {
hdea <- cp2$getHierarchicalDiffExpressionAspects(type='PCA',z.threshold=3)
geneSets <- pagoda2::hierDiffToGenesets(hdea)
}
# add all kinds of embeddings
# various joint embeddding versions
#cp2$embeddings$Conos <- list("All"=t(conO$embedding),"cochlea"=t(conC$embedding),"DRG"=t(conD$embedding),"Merged"=t(conO$embedding),"Refined"=t(con2$embedding))
cn <- rownames(cp2$counts) # cell names
cp2$embeddings$Conos <- list("All"=conos$embedding[cn,])
# hack: add placeholder spaces, so that old p2 version picks up the new embeddings
cp2$reductions$Conos <- list()
cp2$embeddings$sample <- lapply(conos$samples,function(x) { em <- x$embeddings$PCA[[1]]; em[rownames(em) %in% cn,] }) # embeddings of the individual samples
cp2$embeddings$sample <- cp2$embeddings$sample[!unlist(lapply(cp2$embeddings$sample,is.null))]
if (length(cp2$embeddings$sample)>1) {
cp2$reductions$sample <- list()
} else {
cp2$embeddings$sample <- NULL
}
if (!is.null(additional.embeddings)) {
cp2$embeddings$Other <- lapply(additional.embeddings, function(em) { em[rownames(em) %in% cn,] })
cp2$reductions$Other <- list()
}
# additional metadata with different factors .. you probably want to include something like sample or tissue (or patient type)
metadata <- c(metadata,list(sample=samf))
metadata <- lapply(metadata, function(d) d[cn])
meta <- lapply(sn(names(metadata)),function(n) pagoda2::p2.metadata.from.factor(droplevels(as.factor(metadata[[n]])),displayname=n))
p2app <- pagoda2::make.p2.app(cp2, dendrogramCellGroups = as.factor(conos$clusters[[1]]$groups[cn]), additionalMetadata = meta, geneSets = geneSets,innerOrder='odPCA')
# Optional showing of app
#show.app(p2app, name='newPagoda',browse=FALSE)
# Save serialised web object, RDS app and session image
if(save){
p2app$serializeToStaticFast(binary.filename = filename,verbose=TRUE)
}
if(return.details) {
return(list(app=p2app,p2=cp2,cdl=cdl,cn=cn))
} else {
invisible(p2app)
}
}

If you run the sections line-by-line, you might be able to track down the bug. That would be helpful. Otherwise, please email the data to me so I could help out.

It's not something seen before.

Best, Evan

@mckoch234
Copy link
Author

mckoch234 commented Jul 9, 2024 via email

@mckoch234
Copy link
Author

mckoch234 commented Jul 9, 2024 via email

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants