This repository has been archived by the owner on Apr 19, 2023. It is now read-only.

Develop_atac #362

Merged
merged 138 commits into develop from develop_atac
Nov 29, 2021

Conversation

KrisDavie
Member

No description provided.

- If --quiet has been passed on the command line, suppress printing of
any additional messages beyond those that come from Nextflow
- Detect if params.quiet exists during channels input and in the INIT
function
- Fixes #283
- These are currently single-threaded
- Move biomart step outside of the compute_qc_stats function (avoid
querying the server too frequently)
- Rename debarcode_10x_scatac_fastqs to barcode_10x_scatac_fastqs
- Additional options fixed
- Keep the barcodes in the fastq name for now
- bwa mem uses the -C option to add barcodes from the fastq comments
(now added by the barcode_10x_scatac_fastqs.sh script in
singlecelltoolkit).
- Index process now passes through the input bam as output
- Simplify bwa main.nf with pipes
- Barcode, quality and corrected barcodes are now added into the fastq
comments field
- Extracts and corrects barcode in one step
- Clean up the br input tuple passed to trimming
- Includes scripts for processing BioRad data
- Remove option to use an alternate temp directory (instead use NXF_TEMP
env variable, or map the /tmp volume in the container elsewhere)
- This may become a selectable option later on
- Load base config, and profile-specific settings separately
- Run MarkDuplicates, pipe the output to SortSam
- Use existing docker image
- Use MarkDuplicatesSpark process
- Mapping now writes a bam to disk after fixmate
- Marking duplicates is handled with Picard/GATK, also outputting a
coordinate sorted and indexed bam
- Derive readgroup from fastq prior to mapping; add this to the bam file
with bwa (used in Picard/GATK)
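As an illustration of the fastq-comment approach described in the commits above: `bwa mem -C` appends the FASTQ comment to each SAM record, so the comment has to be written as tab-separated SAM tags. The following is a minimal sketch of reading such a header back, not the code used in singlecelltoolkit. The function name `parse_fastq_comment` and the tag names `CR`, `CY` and `CB` (raw barcode, barcode quality, corrected barcode) are assumptions for the example; the commits do not state which tags the script emits.

```python
def parse_fastq_comment(header_line):
    """Split a FASTQ header into the read name and SAM-style tags in its comment.

    Hypothetical helper for illustration only. Assumes the comment follows the
    first space and holds tab-separated TAG:TYPE:VALUE fields.
    """
    line = header_line.rstrip("\n")
    if line.startswith("@"):
        line = line[1:]
    # Everything after the first space is the comment field.
    name, _, comment = line.partition(" ")
    tags = {}
    for field in comment.split("\t"):
        parts = field.split(":", 2)
        if len(parts) != 3:
            continue  # skip empty or malformed fields
        tag, _type, value = parts
        tags[tag] = value
    return name, tags


# Assumed tag names: CR = raw barcode, CY = barcode quality, CB = corrected barcode
name, tags = parse_fastq_comment("@read1 CR:Z:ACGA\tCY:Z:FFFF\tCB:Z:ACGT")
print(name, tags)
```

Keeping the barcode in the comment rather than in the read name means the aligner can carry it into the BAM without a separate tagging step.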
cflerin added 28 commits June 24, 2021 16:05
- collect() fragments and peaks files into a single channel to stage
them in the working directory
- Shorten the input tuple to only have the base filenames so that they
are read from the current working directory
- New docker image
- Updated process parameters, limit max polars threads to 6
- Enable sample-specific parameters
- Add qc documentation
- Minor updates to preprocessing docs
- Fix detection of gzip whitelists
- Update docker container
- Updated polars and pyarrow packages
- Fix for segfault in atac saturation script
- Remap bam/fragments output to be compatible with getDataChannel
- Mix bam/fragments channels for input to qc steps
- In some tools there is little benefit to increasing the number of
threads beyond a certain number. Limits are set now to 6 threads for
adapter trimming and barcoding steps. This will allow more processes to
run in parallel.
- Reformat headings, restructure sections
- Update BioRad read details
- Check how many barcodes were corrected, throw error if the fraction
falls below a threshold (~50%)
- Update docker image
- Add params for max_mismatches and min_frac_bcs_to_find to the barcode
correction process
- Fix bug with gzip detection
- Two new keywords in the metadata: hydrop_2x384 and hydrop_3x96.
- Hydrop barcode extraction runs separately for each type, passing the
parameter to the extract_hydrop_atac_barcode_from_R2_fastq.sh script
- #352
- Docker image update
- Fix params for saturation script
- When staging multiple cellranger fragments files, an input file
collision would occur (files are named identically). This is fixed by
adding a process to rename these files with the sample ID as a prefix.
- Input data is now in the proper format [sampleId, [bam, index], ... ]
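The fragments-file collision fix described above can be sketched outside Nextflow as follows. This is a rough illustration only, not the process added in this PR. The function name `prefixed_names`, the `.` separator and the example paths are all assumptions.

```python
from pathlib import Path


def prefixed_names(samples):
    """Map (sample_id, path) pairs to unique staged file names.

    Hypothetical helper: prefixes each base filename with its sample ID so
    that identically named files can share one working directory.
    """
    staged = {}
    for sample_id, path in samples:
        new_name = f"{sample_id}.{Path(path).name}"
        if new_name in staged:
            raise ValueError(f"duplicate staged name: {new_name}")
        staged[new_name] = path
    return staged


# Example paths are made up; both files share the base name fragments.tsv.gz.
staged = prefixed_names([
    ("s1", "/data/s1/outs/fragments.tsv.gz"),
    ("s2", "/data/s2/outs/fragments.tsv.gz"),
])
print(sorted(staged))
```

Because the sample ID is already unique per input, prefixing with it is enough to make the staged names unique without inspecting file contents.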
@cflerin cflerin merged commit 066bbcf into develop Nov 29, 2021
@cflerin cflerin deleted the develop_atac branch November 29, 2021 21:26