Hygeia

Note

Hygeia is still in alpha. Feel free to run it on your own data, and reach out to us if you need any help.

Requirements

Two Group Analysis

To run the pipeline, install Nextflow and then run the following command. Nextflow will automatically download the latest version of the pipeline from GitHub.

Run the pipeline with a config file (recommended)

nextflow run ucl-medical-genomics/hygeia -c run.config

Note: you can pass -r <git_branch_name> to run a different GitHub branch.

Create a config file that defines the following parameters. You can also override any parameter from the pipeline's default Nextflow config file.

params.cpg_file_path = "/scratch/imoghul/hygeia_data/ref/cpg.tsv.gz"
params.sample_sheet = "/scratch/imoghul/hygeia_data/aging/sample_sheet.csv"
params.output_dir = "results"
params.meteor_mu = "0.95,0.05,0.8,0.2,0.50,0.50"
params.meteor_sigma = "0.05,0.05,0.1,0.1,0.1,0.2886751"
params.min_cpg_sites_between_change_points = 3
params.num_of_inference_seeds = 2
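When the same parameters are reused across runs, the config file can be generated from a script. The sketch below writes the parameters listed above into run.config. The helper name write_run_config and its argument order are illustrative assumptions, not part of Hygeia.

```shell
#!/bin/sh
# Hypothetical helper (not part of Hygeia): write a Nextflow config file
# containing the parameters listed above.
# Usage: write_run_config CONFIG_PATH CPG_FILE SAMPLE_SHEET OUTPUT_DIR
write_run_config() {
  cat > "$1" <<EOF
params.cpg_file_path = "$2"
params.sample_sheet = "$3"
params.output_dir = "$4"
params.meteor_mu = "0.95,0.05,0.8,0.2,0.50,0.50"
params.meteor_sigma = "0.05,0.05,0.1,0.1,0.1,0.2886751"
params.min_cpg_sites_between_change_points = 3
params.num_of_inference_seeds = 2
EOF
}

# Example call using the paths from the config shown above.
write_run_config run.config \
  "/scratch/imoghul/hygeia_data/ref/cpg.tsv.gz" \
  "/scratch/imoghul/hygeia_data/aging/sample_sheet.csv" \
  "results"
```

The generated file can then be passed to the pipeline with -c run.config, as in the command above.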

Run the pipeline without a config file

All the parameters in the config file can also be set via the CLI. This may be useful for scripts.

nextflow run ucl-medical-genomics/hygeia \
  --cpg_file_path "/scratch/imoghul/hygeia_data/ref/cpg.tsv.gz" \
  --sample_sheet "/scratch/imoghul/hygeia_data/aging/sample_sheet.csv" \
  --output_dir "results" \
  --meteor_mu "0.95,0.05,0.8,0.2,0.50,0.50" \
  --meteor_sigma "0.05,0.05,0.1,0.1,0.1,0.2886751" \
  --min_cpg_sites_between_change_points 3 \
  --num_of_inference_seeds 2 \
  -with-report report.html -with-dag flowchart.pdf
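For scripted use, one option is a small wrapper that prints the full command for a given sample sheet so it can be reviewed before it is run. This is a minimal sketch: the function name hygeia_command is hypothetical, and the fixed parameter values are simply copied from the example above.

```shell
#!/bin/sh
# Hypothetical wrapper (not part of Hygeia): print the nextflow command for
# one sample sheet so it can be reviewed before it is run.
# Usage: hygeia_command SAMPLE_SHEET OUTPUT_DIR
hygeia_command() {
  printf '%s' "nextflow run ucl-medical-genomics/hygeia"
  printf ' --cpg_file_path "%s"' "/scratch/imoghul/hygeia_data/ref/cpg.tsv.gz"
  printf ' --sample_sheet "%s"' "$1"
  printf ' --output_dir "%s"' "$2"
  printf ' --meteor_mu "%s"' "0.95,0.05,0.8,0.2,0.50,0.50"
  printf ' --meteor_sigma "%s"' "0.05,0.05,0.1,0.1,0.1,0.2886751"
  printf ' --min_cpg_sites_between_change_points %s' 3
  printf ' --num_of_inference_seeds %s' 2
  printf '\n'
}

# Print the command for the sample sheet used in the example above.
hygeia_command "/scratch/imoghul/hygeia_data/aging/sample_sheet.csv" "results"
```

Once the printed command looks right, it can be executed by piping the output to sh.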

Development

After running the above commands, the pipeline will be cloned into ~/.nextflow/assets/ucl-medical-genomics/hygeia. If you make any changes here, you will need to commit them before running the above command again.

During development, you may want to add process.errorStrategy = 'terminate' to a local Nextflow config to override the pipeline's default (which is to ignore errors).
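A minimal sketch of adding that override from the shell, assuming the local config is the run.config file used earlier (the file name and the helper ensure_terminate_strategy are assumptions). The line is appended only if it is not already present, so the helper can be called repeatedly.

```shell
#!/bin/sh
# Hypothetical helper: append the errorStrategy override to a config file
# unless the exact line is already present.
# Usage: ensure_terminate_strategy CONFIG_PATH
ensure_terminate_strategy() {
  line="process.errorStrategy = 'terminate'"
  touch "$1"
  if ! grep -qxF "$line" "$1"; then
    printf '%s\n' "$line" >> "$1"
  fi
}

# Assumption: run.config is the local config passed to nextflow with -c.
ensure_terminate_strategy "${CONFIG_FILE:-run.config}"
```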

Release a new Docker build

The pipeline uses Docker images hosted on the GitHub Container Registry (ghcr.io). If you change any of the underlying files, including the Dockerfiles, please rebuild the images and push them to the registry:

  1. Log in using a token with access to GitHub Packages. See the GitHub Packages documentation for more info.
export CR_PAT=YOUR_TOKEN

echo "$CR_PAT" | docker login ghcr.io -u USERNAME --password-stdin
  2. Build the Docker images and upload them to GitHub Packages
docker build -t hygeia/single_group src/single_group
docker build -t hygeia/two_group src/two_group

docker tag hygeia/single_group ghcr.io/ucl-medical-genomics/hygeia_single_group:v0.1.2
docker tag hygeia/single_group ghcr.io/ucl-medical-genomics/hygeia_single_group:latest

docker tag hygeia/two_group ghcr.io/ucl-medical-genomics/hygeia_two_group:v0.1.2
docker tag hygeia/two_group ghcr.io/ucl-medical-genomics/hygeia_two_group:latest

docker push ghcr.io/ucl-medical-genomics/hygeia_single_group:latest
docker push ghcr.io/ucl-medical-genomics/hygeia_single_group:v0.1.2
docker push ghcr.io/ucl-medical-genomics/hygeia_two_group:latest
docker push ghcr.io/ucl-medical-genomics/hygeia_two_group:v0.1.2
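Because the version tag appears in several of the commands above, it may help to generate them from a single version argument. The sketch below only prints the commands, grouped per image rather than in the order shown above. The function name release_commands is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: print the build, tag and push commands for both
# images for a given version tag. Nothing is executed here.
# Usage: release_commands VERSION
release_commands() {
  version=$1
  registry="ghcr.io/ucl-medical-genomics"
  for group in single_group two_group; do
    echo "docker build -t hygeia/${group} src/${group}"
    for tag in "$version" latest; do
      echo "docker tag hygeia/${group} ${registry}/hygeia_${group}:${tag}"
      echo "docker push ${registry}/hygeia_${group}:${tag}"
    done
  done
}

# Print the commands for the version used in the steps above.
release_commands v0.1.2
```

After reviewing the printed commands, they can be run by piping the output to sh.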

Tutorial - Run Single Group analysis with NA12878

TODO: Complete Tutorial

  1. Download data
curl -LO https://www.encodeproject.org/files/ENCFF608CXC/@@download/ENCFF608CXC.bigWig

curl -LO https://www.encodeproject.org/files/ENCFF446HUA/@@download/ENCFF446HUA.bed.gz
  2. Run Hygeia
nextflow run ....
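The two download URLs in step 1 follow the same pattern, so they can be built from the file accession and extension. This is a small sketch. The helper encode_url is hypothetical and simply mirrors the URLs shown above.

```shell
#!/bin/sh
# Hypothetical helper: build an ENCODE download URL from a file accession
# and extension, following the pattern of the two URLs in step 1.
# Usage: encode_url ACCESSION EXTENSION
encode_url() {
  printf 'https://www.encodeproject.org/files/%s/@@download/%s.%s\n' "$1" "$1" "$2"
}

# Print the URLs for the two tutorial files.
encode_url ENCFF608CXC bigWig
encode_url ENCFF446HUA bed.gz
```

Each URL can then be downloaded with, for example, curl -LO "$(encode_url ENCFF608CXC bigWig)".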