These tools provide a way to inspect differential gene expression and to rationally filter and combine fusion gene calls, with particular attention to cases with cSVs.
Analysis 1) performs differential gene expression analysis in a cohort without an appropriate control group
Analysis 2) combines and filters the results of fusion gene pipelines using a metacaller approach
More details about both methods will follow soon (manuscript in preparation).
Differential expression analysis for cSVs datasets, for example datasets with chromothripsis or chromoanasynthesis, where in principle no adequate control group exists.
- Download the run_DE.R RScript from the repository
- Copy the input file to the folder containing the run_DE.R RScript:
The input file is a normalized expression table on log2 scale in tab-separated format (gene names in the first column, named "Gene_symbol")
- Run the run_DE.R RScript with parameter -h:
Rscript --vanilla run_DE.R -h
...to view the list of parameters needed to run the script
The output file is in tab-separated format, with rows representing all differentially expressed genes in the dataset and columns for individual samples (an up/down regulation tag is used)
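Before running run_DE.R, it can help to confirm that the input table follows the layout described above. The following is a minimal sketch, not part of the repository: the function name `check_expr_table` and the toy values are assumptions made for illustration. It only checks that the first header field is "Gene_symbol" and that every row has the same number of tab-separated fields.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not shipped with run_DE.R): sanity-check an expression table.
check_expr_table() {
  local file="$1"
  awk -F'\t' '
    NR == 1 {
      # The first column of the header must be named Gene_symbol
      if ($1 != "Gene_symbol") {
        print "first column must be named Gene_symbol" > "/dev/stderr"
        exit 1
      }
      n = NF
      next
    }
    NF != n {
      # Every data row must have as many fields as the header
      printf "line %d has %d fields, expected %d\n", NR, NF, n > "/dev/stderr"
      exit 1
    }
  ' "$file"
}

# Toy table with made-up log2 expression values for two samples
tmp="$(mktemp)"
printf 'Gene_symbol\tS1\tS2\nTP53\t5.2\t6.1\nMYC\t8.0\t7.4\n' > "$tmp"

if check_expr_table "$tmp"; then
  echo "table layout looks OK"
fi
```

The check says nothing about whether the values are actually normalized or on log2 scale; it only catches layout problems early.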
This tool for fusion gene identification was developed to rationally filter and combine results from multiple pipelines (EricScript, JAFFA, FusionCatcher), in order to maximize the reliability of the results and to avoid calling false positive fusion events.
- Download run_MCaller.R from the repository
- Copy the results for all samples from each pipeline to separate new folders, e.g. ~/PATH/eric , ~/PATH/fc , ~/PATH/jaffa
- Please name the file prefixes in each folder consistently, e.g. S1_eric.results.tsv, S1_jaffa.results.tsv, S1_fc.results.tsv
NOTE1: here "S1" is recognized as the sample name. The underscore (_) is needed to identify the sample name in the file name, so please place _ immediately after the sample name
NOTE2: the final data folder structure should look like:
.
├── eric
│ └── S1_eric.results.tsv
├── fc
│ └── S1_fc.results.tsv
└── jaffa
  └── S1_jaffa.results.tsv
- Run the run_MCaller.R RScript with parameter -h:
Rscript --vanilla run_MCaller.R -h
...to view the list of parameters, then provide all of them to run the analysis
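The folder layout and file naming from the steps above can be prepared along these lines. This is a sketch under assumptions: the helper `sample_name` is hypothetical, and it takes everything before the first underscore as the sample name, which is one reading of NOTE1 and not necessarily how run_MCaller.R parses names internally. A temporary directory stands in for ~/PATH.

```shell
#!/usr/bin/env bash
# Hypothetical helper: derive the sample name as the text before the first underscore.
sample_name() {
  local base
  base="$(basename "$1")"
  printf '%s\n' "${base%%_*}"
}

# Temporary directory used in place of ~/PATH for this demo
root="$(mktemp -d)"
mkdir -p "$root/eric" "$root/fc" "$root/jaffa"

# Empty placeholder files named as in the example tree
touch "$root/eric/S1_eric.results.tsv" \
      "$root/fc/S1_fc.results.tsv" \
      "$root/jaffa/S1_jaffa.results.tsv"

# List each result file together with the sample name derived from it
for f in "$root"/*/*.tsv; do
  printf '%s\t%s\n' "$(sample_name "$f")" "${f#"$root"/}"
done
```

Under this assumption a sample name that itself contains an underscore (for example "S1_rep2") would be cut at the first underscore, so such names are best avoided.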
The output file is in tab-separated format, with rows representing fusion genes identified by at least 2 fusion gene prediction tools. The columns give the sample identifier, the chromosomes of both genes in each fusion pair, and tags indicating which callers identified the fusion.
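The "at least 2 tools" rule can be illustrated with a small consensus filter. This is only a sketch and not the logic of run_MCaller.R: it assumes the calls have already been reduced to a simplified four-column table (sample, gene 1, gene 2, caller), which is a made-up intermediate format, and the function name `consensus_fusions` and the toy rows are hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical consensus filter over a simplified table: sample, gene1, gene2, caller.
consensus_fusions() {
  # sort -u drops repeated calls from the same caller, so each caller counts once
  sort -u "$1" | awk -F'\t' -v OFS='\t' '
    {
      key = $1 OFS $2 OFS $3
      n[key]++
      tags[key] = (key in tags) ? tags[key] "," $4 : $4
    }
    END {
      # Keep only fusions reported by two or more distinct callers
      for (k in n) if (n[k] >= 2) print k, n[k], tags[k]
    }
  ' | sort
}

# Toy input: one fusion seen by two callers (one of them twice), one seen by a single caller
calls="$(mktemp)"
printf 'S1\tBCR\tABL1\teric\n'      >  "$calls"
printf 'S1\tBCR\tABL1\tfc\n'        >> "$calls"
printf 'S1\tGENEA\tGENEB\tjaffa\n'  >> "$calls"
printf 'S1\tBCR\tABL1\tfc\n'        >> "$calls"

consensus_fusions "$calls"
```

The sketch treats gene order as significant, so a pair reported as A-B by one caller and B-A by another would not be matched; a real implementation would also need to parse each tool's native output format.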