Releases: nanoporetech/medaka
v2.0.1
v2.0.0
Switched from tensorflow to pytorch.
Existing models for recent basecallers have been converted to the new format.
PyTorch format models contain a _pt suffix in the filename.
Changed
- Inference is now performed using PyTorch instead of TensorFlow.
- The medaka consensus command has been renamed to medaka inference to reflect
its function in running an arbitrary model and avoid confusion with
medaka_consensus.
- The medaka stitch command has been renamed to medaka sequence to reflect its
function in creating a consensus sequence.
- The medaka variant command has been renamed to medaka vcf to reflect its
function in consolidating variants and avoid confusion with medaka_variant.
- Order of arguments to medaka vcf has been changed to be more consistent
with medaka sequence.
- The helper script medaka_haploid_variant has been renamed medaka_variant to
save typing.
- Make --ignore_read_groups option available to more medaka subcommands
including inference.
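For scripts written against the v1.x command names, the renames above can be captured in a small lookup. The following is a minimal sketch, not part of medaka: the function name migrate_command and the two mapping constants are hypothetical, and only the subcommand or helper-script name is rewritten. Because the argument order of medaka vcf also changed, the helper flags that case for manual review rather than attempting to reorder arguments.

```python
# Hypothetical migration helper; not part of medaka itself.
# Old -> new subcommand names, taken from the "Changed" list above.
SUBCOMMAND_RENAMES = {
    "consensus": "inference",
    "stitch": "sequence",
    "variant": "vcf",
}

# Old -> new helper-script names.
SCRIPT_RENAMES = {
    "medaka_haploid_variant": "medaka_variant",
}


def migrate_command(argv):
    """Return (new_argv, needs_review) for a v1.x-style command line.

    needs_review is True when the command became "medaka vcf", since the
    order of its arguments changed and must be checked by hand.
    """
    if not argv:
        return [], False
    new_argv = list(argv)
    needs_review = False
    if new_argv[0] in SCRIPT_RENAMES:
        new_argv[0] = SCRIPT_RENAMES[new_argv[0]]
    elif new_argv[0] == "medaka" and len(new_argv) > 1:
        old = new_argv[1]
        if old in SUBCOMMAND_RENAMES:
            new_argv[1] = SUBCOMMAND_RENAMES[old]
            needs_review = new_argv[1] == "vcf"
    return new_argv, needs_review
```

The remaining arguments are passed through untouched, so any options whose names or positions changed between versions still need to be checked against the command's help output.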
Removed
- The medaka snp command has been removed. This was long defunct as diploid SNP
calling had been deprecated, and medaka variant is used to create VCFs for
current models.
- Loading models in hdf format has been deprecated.
- Deleted minimap2 and racon wrappers in medaka/wrapper.py.
Added
- Release conda packages for Linux (x86 and aarch64) and macOS (arm64).
- Option --lr_schedule allows using cosine learning rate schedule in training.
- Option --max_valid_samples to set number of samples in a training validation
batch.
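As an illustration of what a cosine learning rate schedule does, the sketch below implements the standard cosine annealing formula in plain Python. It is not medaka's training code: the function name cosine_lr and the parameters base_lr, min_lr, and total_steps are assumptions made for the example, and the values accepted by --lr_schedule are not covered here.

```python
import math


def cosine_lr(step, total_steps, base_lr, min_lr=0.0):
    """Learning rate at a given step under standard cosine annealing.

    Decays smoothly from base_lr (step 0) to min_lr (step == total_steps).
    All names here are illustrative, not medaka options.
    """
    if total_steps <= 0:
        raise ValueError("total_steps must be positive")
    # Clamp so that steps outside [0, total_steps] stay at the endpoints.
    progress = min(max(step, 0), total_steps) / total_steps
    return min_lr + 0.5 * (base_lr - min_lr) * (1.0 + math.cos(math.pi * progress))
```

The rate falls slowly at first, fastest around the midpoint, and slowly again near the end, which is the usual motivation for preferring it over a step-wise decay.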
Fixed
- Training models with DiploidLabelScheme uses categorical cross-entropy loss
instead of binary cross-entropy.
v1.12.1
(Probably) final version of medaka using tensorflow. Future versions will use
pytorch instead.
Fixed
- medaka_consensus: only keep bam tags if input file matches joint polishing pipeline.
- Pin numpy to <2.0.0.
Added
- Consensus and variant models lookup for v3.5.1 Dorado models.
v1.12.0
Fixed
- tandem: Use haplotag 0 in unphased mode.
- tandem: Don't run consensus if regions set is empty.
Added
- Models for version 5 basecaller models.
- Expose sym_indels option for training.
- Expose --min_mapq minimum mapping quality alignment filtering option for
medaka consensus.
- tandem: Option --ignore_read_groups to ignore read groups present in input
file.
- Wrapper script medaka_consensus_joint and convenience tools
(prepare_tagged_bam, get_model_dtypes) to facilitate joint polishing with
multiple datatypes.
v1.11.3
Added
- Consensus and variant models for v4.3.0 dorado models.
v1.11.2
Added
- Parsing model information from fastq headers output by Guppy and MinKNOW.
Changed
- Additional explanatory information in VCF INFO fields concerning depth calculations.
v1.11.1
Fixed
- Do not exit if model cannot be interpreted, use the default instead.
- An issue with co-ordinate handling in computing variants from alignments.
Added
- Ability to use basecaller model name as --model argument.
- Better handling of errors when running abpoa.
v1.11.0
Fixed
- Correct suffix of consensus file when medaka_consensus outputs a fastq.
Added
- Choice of model file can be introspected from input files. For BAM files the
read group (RG) headers are searched according to the dorado specification,
whilst for .fastq files the comment sections of a number of reads are checked
for corresponding read group information. In the latter case see README for
information on correctly converting basecaller output to .fastq whilst
maintaining the relevant meta information. medaka tools resolve_model can
display the model that would automatically be used for a given input file.
Changed
- If no model is provided on command-line interface (medaka consensus,
medaka_consensus, and medaka_haploid_variant) automatic attempts will be made
to choose the appropriate model.
v1.10.0
Changed
- Tensorflow logging level no longer set from Python.
- spoa and parasail are now strict requirements.
Fixed
- Sort VCF before annotating in medaka_haploid_variant.
- Ignore errors when deleting temporary files.
- The output of the first POA run not being used in the second iteration in smolecule command.
Added
- Support for Python 3.11.
- --spoa_min_coverage option to smolecule command.
Removed
- Support for Python 3.7.
v1.9.1
Fixed
- A long-standing bug in pileup_counts that manifests for single-position pileups on ARM64.