This package infers relative cell type abundance and its variability across bulk tumor samples obtained from a multi-region sequencing design. ICeITH is a reference-based deconvolution method. It addresses a limitation of current methods by modeling a patient-specific mean expression, which accounts for the heterogeneity in gene expression introduced by the multi-region sequencing design. In addition, ICeITH measures intratumor heterogeneity by quantifying the variability of the targeted cellular composition, which may reveal an association with patients’ survival risk.
To install the package:
install.packages('devtools')
devtools::install_github("pengyang0411/ICeITH")
To demonstrate the usage of the ICeITH package, we provide a function, sim_func, to simulate multi-region gene expression data and cell-type-specific reference profiles.
library(ICeITH)
simData <- sim_func(K = 4,      ## Number of cell types
                    G = 500,    ## Number of genes
                    lowS = 3,   ## Minimum number of samples per subject
                    maxS = 5,   ## Maximum number of samples per subject
                    N = 10,     ## Number of patient subjects
                    nRef = 100) ## Number of references for each cell type
simData is a list containing the cell-type-specific expression profiles, the mixed multi-region gene expression data, and the intratumor heterogeneity level of each patient subject.
Other options are available, and the output values are described in detail on the help page:
?ICeITH::sim_func
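To make the idea of "mixed" expression concrete, the sketch below builds bulk samples as weighted sums of cell-type-specific profiles, which is the generic linear mixing view behind reference-based deconvolution. It is written in base R only and is an assumption for illustration: it is not necessarily the generative model used by sim_func, and the names profiles, props, and bulk are hypothetical.

```r
## Illustrative sketch only: a generic linear mixing model in base R.
## This is NOT the sim_func implementation; all object names are hypothetical.
set.seed(1)
G <- 6   # number of genes
K <- 3   # number of cell types
S <- 4   # number of bulk samples

## Cell-type-specific expression profiles (genes x cell types)
profiles <- matrix(rgamma(G * K, shape = 2, rate = 0.1), nrow = G)

## Cell-type proportions per sample (cell types x samples), columns sum to 1
raw   <- matrix(rgamma(K * S, shape = 1), nrow = K)
props <- sweep(raw, 2, colSums(raw), "/")

## Each bulk sample is a proportion-weighted sum of the profiles (genes x samples)
bulk <- profiles %*% props
dim(bulk)
```

Deconvolution runs this in reverse: given the bulk matrix and reference information about the profiles, it estimates the proportions.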
The first step of the model estimation is to obtain the prior knowledge (i.e., the cell-type-specific mean expression and variability of each gene) from the reference profile using the refEst function. It needs a cell-type-specific gene expression matrix and a vector that labels the cell type of each reference sample.
## Estimate the reference
reference = refEst(simData$X_gr,        ## Reference matrix
                   cts = simData$ct_s)  ## Reference cell types
The estimation results are displayed:
For more details, please review the help page:
?ICeITH::refEst
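As a rough picture of what "cell-type-specific mean expression and variability of each gene" means, the following base R sketch computes a per-gene mean and variance within each cell type from a labeled reference matrix. It is only an illustration under stated assumptions: genes are assumed to be in rows and reference samples in columns, the names X, cellTypes, ctMean, and ctVar are hypothetical, and refEst itself may estimate these quantities differently.

```r
## Illustrative sketch only: per-cell-type summaries computed with base R.
## This is NOT the refEst implementation; all object names are hypothetical.
set.seed(1)
G <- 5                                    # number of genes
cellTypes <- rep(c("A", "B"), each = 4)   # cell-type label of each reference sample

## Toy reference matrix (assumed layout: genes in rows, samples in columns)
X <- matrix(rexp(G * length(cellTypes), rate = 0.1),
            nrow = G,
            dimnames = list(paste0("gene", seq_len(G)), NULL))

## Column indices of the reference samples belonging to each cell type
idxByType <- split(seq_along(cellTypes), cellTypes)

## Mean expression of each gene within each cell type (genes x cell types)
ctMean <- sapply(idxByType, function(idx) rowMeans(X[, idx, drop = FALSE]))

## Variance of each gene within each cell type (genes x cell types)
ctVar <- sapply(idxByType, function(idx) apply(X[, idx, drop = FALSE], 1, var))

dim(ctMean)
```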
The second step of the model estimation is to quantify the relative cell-type abundance and classify the intratumor heterogeneity level using the ICeITH function. It requires the multi-region gene expression data from a cohort and a sample index that maps each sample to its patient subject:
## Estimate the model
res_All = ICeITH(Y = simData$Y,            ## Multi-region gene expression data
                 reference = reference,    ## Prior knowledge from reference
                 sampIndex = simData$I_i,  ## Sample index
                 maxIters = 20)            ## Maximum number of iterations
## Loading required namespace: e1071
## Iteration: 1 objective value: -219259.3
## Iteration: 2 objective value: -218267.3
## Iteration: 3 objective value: -218261.6
## Iteration: 4 objective value: -218242.3
## Iteration: 5 objective value: -218242.3
## Iteration: 6 objective value: -218242.3
## Converged
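The sample index is what ties several regional samples to one patient. The base R sketch below shows one plausible way to build such an index and use it to summarize how much a cell type's abundance varies across the regions of each patient. Everything in it is an assumption for illustration: the index is assumed to be an integer vector with one entry per sample (check the package help pages for the expected format), the abundance values are made up, and the standard deviation is a simple stand-in rather than the heterogeneity measure estimated by the model.

```r
## Hypothetical illustration in base R; not part of the ICeITH API.
set.seed(1)
nSamples  <- c(3, 5, 4)                                   # samples per patient
sampIndex <- rep(seq_along(nSamples), times = nSamples)   # patient id of each sample

## Made-up abundance of one cell type in each sample (values in [0, 1])
prop <- runif(length(sampIndex))

## Within-patient spread of that abundance across regions
## (standard deviation used here only as a simple stand-in)
regionSD <- tapply(prop, sampIndex, sd)

sampIndex
regionSD
```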
The estimated relative cell-type abundances are displayed:
Peng Yang ([email protected])
Peng Yang, Shawna M. Hubert, P. Andrew Futreal, Xingzhi Song, Jianhua Zhang, J. Jack Lee, Ignacio Wistuba, Ying Yuan, Jianjun Zhang, Ziyi Li. “A novel Bayesian model for assessing intratumor heterogeneity of tumor infiltrating leukocytes with multiregion gene expression sequencing.” The Annals of Applied Statistics, 18(3) 1879-1898 September 2024. https://doi.org/10.1214/23-AOAS1862.