ucl-medical-genomics/gimmecpg-python

GIMMEcpg-python

About The Project

A Python implementation of GIMMEcpg, built with Polars and H2O AutoML.

Getting Started

usage: main.py [-h] -i INPUT -o OUTPUT -r REF [-c MINCOV] [-d MAXDISTANCE]
[-k] [-a] [-t RUNTIME] [-m MAXMODELS] [-s]

Options for imputing missing CpG sites based on neighbouring sites:

-h, --help           show this help message and exit
-i, --input          Path to directory of bed files (make sure it contains only the bed files to be analysed)
-o, --output         Path to output directory
-r, --ref            Path to reference methylation file
-c, --minCov         Minimum coverage to consider methylation site as present. Default = 10
-d, --maxDistance    Maximum distance between missing site and each neighbour for the site to be imputed. Default = all sites considered
-k, --collapse       Choose whether to merge methylation sites on opposite strands together. Default = False
-a, --accurate       Choose between Accurate and Fast mode. Default = Fast
-t, --runTime        Time (seconds) to train model. Default = 3600s (1h)
-m, --maxModels      Maximum number of models to train within the time specified under --runTime. Excludes Stacked Ensemble models
-s, --streaming      Choose if streaming is required (for files that exceed memory). Default = False
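
The options above map naturally onto Python's standard argparse module. The following is only a minimal sketch of how such a command line could be declared, not the project's actual main.py: the argument types (int for the numeric options), the use of None to mean "no limit", and the example paths are all assumptions made for illustration.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the documented options (not the real main.py)."""
    parser = argparse.ArgumentParser(
        prog="main.py",
        description="Options for imputing missing CpG sites based on neighbouring sites",
    )
    # Required paths
    parser.add_argument("-i", "--input", required=True,
                        help="Path to directory of bed files")
    parser.add_argument("-o", "--output", required=True,
                        help="Path to output directory")
    parser.add_argument("-r", "--ref", required=True,
                        help="Path to reference methylation file")
    # Numeric options; int types are an assumption
    parser.add_argument("-c", "--minCov", type=int, default=10,
                        help="Minimum coverage to consider a site as present")
    parser.add_argument("-d", "--maxDistance", type=int, default=None,
                        help="Maximum distance to each neighbour (None = all sites)")
    parser.add_argument("-t", "--runTime", type=int, default=3600,
                        help="Time in seconds to train the model")
    parser.add_argument("-m", "--maxModels", type=int, default=None,
                        help="Maximum number of models to train")
    # Boolean switches: absent means False, present means True
    parser.add_argument("-k", "--collapse", action="store_true",
                        help="Merge methylation sites on opposite strands")
    parser.add_argument("-a", "--accurate", action="store_true",
                        help="Use Accurate mode instead of Fast mode")
    parser.add_argument("-s", "--streaming", action="store_true",
                        help="Enable streaming for files that exceed memory")
    return parser


# Example with hypothetical paths: Accurate mode, at most 5 models, 30 minutes of training
example = build_parser().parse_args(
    ["-i", "beds/", "-o", "out/", "-r", "ref.bed", "-a", "-m", "5", "-t", "1800"]
)
print(example.accurate, example.maxModels, example.runTime, example.minCov)
```

In this sketch, -k, -a and -s are switches that take no value, which matches the usage line where they appear without a metavariable.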

Prerequisites

Installation
