northwestwitch/cosmic2bed

cosmic2bed

Howto and script to convert Cosmic_MutantCensus and Cosmic_NonCodingVariants .TSV files into bigbed tracks that can be used in IGV.

Created files correspond to the UCSC COSMIC tracks described here

Howto

Install this software

  1. Create a conda environment with Python 3 -> conda create -n py3 python=3.11
  2. Activate the environment -> conda activate py3
  3. Clone this repository -> git clone https://github.com/northwestwitch/cosmic2bed.git
  4. Enter cloned folder -> cd cosmic2bed
  5. Install poetry -> pip install poetry
  6. Install this software -> poetry install
  7. Make sure the script works -> poetry run cosmic2bed --help

COSMIC data availability

COSMIC data must be downloaded from COSMIC. Note that you need to register as a non-commercial user or hold a commercial license in order to download it.

Howto using demo data

Demo data present in this repository consists of 2 files: Cosmic_MutantCensus_v100_GRCh38.tsv and Cosmic_NonCodingVariants_v100_GRCh38.tsv, both present in the .tar sample download in build 38 obtained from COSMIC. These files can be found in the cosmic2bed/demo/infiles folder.

Convert a demo .tsv to a sorted bed

Demo outfiles were created in this way:

poetry run cosmic2bed -i cosmic2bed/demo/infiles/Cosmic_MutantCensus_v100_GRCh38.tsv -o cosmic2bed/demo/outfiles/Cosmic_MutantCensus_v100_GRCh38.bed --build 38

This command will convert the .tsv file to a 6+3 BED file.
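
Before moving on to the bigBed step, it can help to confirm that the BED file really has 9 tab-separated columns (6 standard plus 3 extra) and is sorted by chromosome and start position, since bedToBigBed expects sorted input. The following is a minimal sketch of such a check, not part of cosmic2bed itself; the function name `check_bed_6_plus_3` and the demo rows are made up for illustration.

```python
from typing import Iterable


def check_bed_6_plus_3(lines: Iterable[str]) -> int:
    """Return the number of BED records; raise ValueError on the first problem.

    Hypothetical helper: it only checks the column count, basic coordinate
    sanity, and (chromosome, start) ordering.
    """
    previous = None
    count = 0
    for number, line in enumerate(lines, start=1):
        line = line.rstrip("\n")
        if not line:
            continue  # ignore blank lines
        fields = line.split("\t")
        if len(fields) != 9:
            raise ValueError(f"line {number}: expected 9 columns, found {len(fields)}")
        chrom, start, end = fields[0], int(fields[1]), int(fields[2])
        if start < 0 or end < start:
            raise ValueError(f"line {number}: invalid interval {start}-{end}")
        key = (chrom, start)
        # Chromosome names are compared as plain strings, start positions as integers.
        if previous is not None and key < previous:
            raise ValueError(f"line {number}: not sorted by chromosome and start")
        previous = key
        count += 1
    return count


# Made-up rows, only to show the expected shape of a 6+3 BED line.
demo_rows = [
    "chr1\t100\t101\tid1\t0\t+\textra1\textra2\textra3\n",
    "chr1\t250\t251\tid2\t0\t-\textra1\textra2\textra3\n",
]
print(check_bed_6_plus_3(demo_rows), "records look fine")
```

To check a real file, pass an open file handle instead of the demo list, for instance `check_bed_6_plus_3(open("cosmic2bed/demo/outfiles/Cosmic_MutantCensus_v100_GRCh38.bed"))`.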

Convert the sorted BED to bigbed

The sorted BED file created in the step above can be converted to bigBed using the bedToBigBed utility from UCSC, which can also be installed using conda. In this example I used the binary present in the cosmic2bed/scripts folder (don't use it; download the binary built for your architecture from UCSC instead) and ran the following command:

./cosmic2bed/scripts/bedToBigBed -type=bed6+3 -as=<path-to-bedplus-definitions> <path-to-sorted-bed-infile> <path-to-chrom-sizes> <path-to-sorted-bigbed-outfile> -tab
  • path-to-bedplus-definitions: use the path to the bedPlus definitions -> cosmic2bed/resources/bedPlus_definitions.as
  • path-to-sorted-bed-infile: it's the sorted BED file obtained in the step above
  • path-to-chrom-sizes: A file with chromosome sizes is present in this repository under cosmic2bed/resources. Choose the right genome build.
  • path-to-sorted-bigbed-outfile: It's the outfile, for instance Cosmic_MutantCensus_v100_GRCh38.bb
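
If you prefer to drive the conversion from Python, the command above can be assembled as an argument list and passed to `subprocess`. This is a minimal sketch that only reuses the flags shown above; the helper names `bed_to_bigbed_command` and `run_bed_to_bigbed` are hypothetical, and the chromosome sizes path is a placeholder because the actual file name depends on the genome build you choose.

```python
import subprocess


def bed_to_bigbed_command(tool, as_file, bed_in, chrom_sizes, bigbed_out) -> list[str]:
    """Build the bedToBigBed argument list in the same order as the command above."""
    return [
        str(tool),
        "-type=bed6+3",
        f"-as={as_file}",
        str(bed_in),
        str(chrom_sizes),
        str(bigbed_out),
        "-tab",
    ]


def run_bed_to_bigbed(*args) -> None:
    """Run the conversion; raises CalledProcessError if bedToBigBed fails."""
    subprocess.run(bed_to_bigbed_command(*args), check=True)


command = bed_to_bigbed_command(
    "bedToBigBed",  # assumes the UCSC binary is on your PATH
    "cosmic2bed/resources/bedPlus_definitions.as",
    "cosmic2bed/demo/outfiles/Cosmic_MutantCensus_v100_GRCh38.bed",
    "<path-to-chrom-sizes>",  # placeholder: pick the file matching your build
    "Cosmic_MutantCensus_v100_GRCh38.bb",
)
print(" ".join(command))
```

Passing the arguments as a list avoids shell quoting issues with paths that contain spaces. Call `run_bed_to_bigbed` with the same five arguments once the placeholder is replaced with a real path.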
