Releases: KamilSJaron/smudgeplot

Oriel

24 Sep 14:10
f12ec1e

The big changes are

  • the search for kmer pairs covers both the canonical and the non-canonical kmer sets (Gene demonstrated that it makes a difference)
  • the tool supports the FastK kmer counter only
  • the backend by Gene is parallelized and massively faster
  • the intermediate file is a flat file holding a 2D histogram with cov1, cov2 and freq columns (as opposed to a list of the coverages of each pair, cov1 cov2)
  • at least for now, WE LOSE the ability to extract the sequences of the kmers in a pair; this functionality will hopefully be restored at some point, together with functionality to assess the quality of an assembly
  • the smudge detection algorithm is under revision, and a new version will be released on 18th of October 2024
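
To make the change of intermediate format concrete, here is a minimal sketch of how a list of per-pair coverages collapses into cov1, cov2, freq rows. The function names, the tab separator, and the choice to put the lower coverage first are assumptions made for illustration; they are not taken from the smudgeplot or FastK sources.

```python
from collections import Counter


def pairs_to_histogram(pairs):
    """Collapse (cov1, cov2) pairs into sorted (cov1, cov2, freq) rows."""
    # Assumption: each pair is stored with the lower coverage first.
    counts = Counter((min(a, b), max(a, b)) for a, b in pairs)
    return [(c1, c2, n) for (c1, c2), n in sorted(counts.items())]


def format_rows(rows, sep="\t"):
    """Render histogram rows as flat text, one cov1/cov2/freq row per line."""
    return "\n".join(sep.join(str(v) for v in row) for row in rows)


# Old-style input: one entry per kmer pair.
pairs = [(10, 20), (10, 20), (20, 10), (15, 15), (10, 30)]
rows = pairs_to_histogram(pairs)
print(format_rows(rows))
```

The histogram grows with the number of distinct coverage combinations rather than with the number of kmer pairs, which is why it cannot carry the sequences of individual pairs.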

Double-hung with curtains

21 Feb 00:51
  • fixed an issue with L and U being too close to each other. Smudgeplot simply plots whatever data are fed to it, even when the result is wild and harder to interpret (this is in line with the "honest data reporting" philosophy of smudgeplot, but might cause more confusion; perhaps we will add some more warnings in the future)

Double-Hung

01 Aug 16:36

Adding a new feature smudgeplot extract for extracting kmer pairs from a rectangle of the smudgeplot.

Great thanks to @zhenzhenyang-psu !

Documentation: https://github.com/KamilSJaron/smudgeplot/wiki/smudgeplot-extract
For usage see smudgeplot.py extract -h
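
As a rough illustration of what extracting pairs from a rectangle means, the sketch below filters kmer pairs by a box in plot coordinates. It assumes the two axes are the minor coverage fraction B / (A + B) and the total pair coverage A + B; the function names, the tuple layout, and the example sequences are hypothetical. The real arguments are the ones listed by `smudgeplot.py extract -h`.

```python
def in_rectangle(cov_a, cov_b, min_ratio, max_ratio, min_total, max_total):
    """Return True if a pair's coordinates fall inside the rectangle."""
    total = cov_a + cov_b
    if total <= 0:
        return False  # a pair without coverage has no position on the plot
    ratio = min(cov_a, cov_b) / total  # assumed x axis: minor coverage fraction
    return min_ratio <= ratio <= max_ratio and min_total <= total <= max_total


def extract_pairs(pairs, min_ratio, max_ratio, min_total, max_total):
    """Keep (kmer1, kmer2, cov1, cov2) records located inside the rectangle."""
    return [
        p for p in pairs
        if in_rectangle(p[2], p[3], min_ratio, max_ratio, min_total, max_total)
    ]


# Hypothetical records: two kmer sequences followed by their coverages.
pairs = [
    ("ACGTA", "ACGTC", 20, 21),
    ("TTGCA", "TTGCC", 10, 31),
    ("GGATC", "GGATG", 40, 42),
]
selected = extract_pairs(pairs, 0.45, 0.50, 30, 50)
print(selected)
```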

Still Single Hung

07 Feb 16:16

This release is just to get it to Zenodo, otherwise identical to v0.2.2.

Single-Hung

24 Jan 17:08

This version:

  • updates the annotation algorithm for higher ploidy levels, based on simulations
  • encourages the use of our KMC, which speeds up the search for kmer pairs a lot
  • adds new warnings for mismatching estimates of 1n coverage by different approaches (previously silent)
  • changes the terminology: instead of "estimated ploidy" we say "proposed ploidy", as there is no model explicitly tested

Single-Hung

08 Aug 15:09
  • fixed logging (it is now directed to the stderr stream)
  • the ploidy estimate is based on all smudges of each ploidy (instead of on the ploidy of the brightest smudge)
  • the smudgeplot interface uses the .py suffix to meet community standards
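
The difference between the two ways of estimating ploidy can be sketched as follows. This is one possible reading of "based on all smudges", namely summing the sizes of smudges that share a ploidy; the function names and the example proportions are invented for illustration and are not taken from the smudgeplot code.

```python
from collections import defaultdict


def ploidy_from_brightest(smudge_sizes):
    """Previous behaviour: the ploidy of the single largest smudge."""
    brightest = max(smudge_sizes, key=smudge_sizes.get)
    return len(brightest)  # "AAB" has three letters, so ploidy 3


def ploidy_from_all_smudges(smudge_sizes):
    """Assumed new behaviour: sum smudge sizes per ploidy, pick the largest."""
    totals = defaultdict(float)
    for label, size in smudge_sizes.items():
        totals[len(label)] += size
    return max(totals, key=totals.get)


# Hypothetical proportions of kmer pairs assigned to each smudge.
sizes = {"AB": 0.35, "AAB": 0.25, "AAAB": 0.10, "AABB": 0.30}
print(ploidy_from_brightest(sizes), ploidy_from_all_smudges(sizes))
```

In this made-up example the brightest smudge is AB, but the two tetraploid smudges together outweigh it, so the two approaches disagree.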

Single-Hung

14 May 19:10

This version uses the same computational backend as the previous version (0.1.3), but it is wrapped in a single interface that is expected to be kept in the future:

smudgeplot <task> <arguments>
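
A single entry point with tasks is commonly built with argparse subcommands. The sketch below shows that general pattern only; it is not the actual smudgeplot parser. The `hetkmers` task and its `--middle` flag are named in these release notes, while the `plot` task, the positional `infile` argument, and the help strings are assumptions.

```python
import argparse


def build_parser():
    """Build a hypothetical `smudgeplot <task> <arguments>` style parser."""
    parser = argparse.ArgumentParser(prog="smudgeplot")
    tasks = parser.add_subparsers(dest="task", required=True)

    # Task named in the release notes; its arguments here are illustrative.
    het = tasks.add_parser("hetkmers", help="search for kmer pairs")
    het.add_argument("--middle", action="store_true",
                     help="use the alternative extraction algorithm")
    het.add_argument("infile")

    # Assumed second task, included to show how tasks sit side by side.
    plot = tasks.add_parser("plot", help="draw a smudgeplot")
    plot.add_argument("infile")
    return parser


args = build_parser().parse_args(["hetkmers", "--middle", "kmers.dump"])
print(args.task, args.middle, args.infile)
```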

Further adjustments:

  • improved algorithm for placing smudges on the plot for higher ploidy levels than 4
  • alternative algorithm for extracting kmers available (--middle in hetkmers task)

I had no idea how to name the release, so I have decided to name individual versions of smudgeplot after types of windows. Let's start simple: Single-Hung it is. Hopefully, it will be a good enough name to carry all the smudges.

beta3 - More modest algorithm for guessing 1n coverage, no default filtering

18 Oct 08:12
  • no quantile filtering by default; a new parameter -q sets up the filter, and the flag --no-qunatile-filtering was removed (use -q 0.99 to get the previous default filter)
  • the algorithm for peak identification was failing if AAB was absent but AAAABB was present. This problem should now be resolved by considering both the diploid and triploid 1n estimates.
  • fixed the installation instructions
  • minor fixes
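
For readers unfamiliar with the filter, the following sketch shows one way a quantile filter controlled by a `q` value could behave: pairs whose total coverage exceeds the q-th quantile are dropped, and no filtering happens when `q` is unset. The nearest-rank quantile definition, the function names, and the use of total pair coverage as the filtered quantity are assumptions, not a description of the actual implementation.

```python
import math


def quantile_threshold(values, q):
    """Nearest-rank quantile: smallest value with at least q of the data at or below it."""
    ordered = sorted(values)
    rank = max(1, math.ceil(q * len(ordered)))
    return ordered[rank - 1]


def filter_by_quantile(pairs, q=None):
    """Drop pairs whose total coverage exceeds the q-th quantile.

    q=None mirrors the new default of no filtering.
    """
    if q is None:
        return list(pairs)
    cutoff = quantile_threshold([a + b for a, b in pairs], q)
    return [(a, b) for a, b in pairs if a + b <= cutoff]


# Hypothetical coverages; the last pair is a high-coverage outlier.
pairs = [(10, 10), (10, 20), (15, 15), (500, 700)]
print(filter_by_quantile(pairs))          # default: nothing removed
print(filter_by_quantile(pairs, q=0.75))  # outlier removed
```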

beta2 release

27 Sep 16:39
  • switched to a colorblind-friendly palette
  • the nbins parameter autoscales by default and is fixed if set by the user
  • added an option to disable quantile filtering (--no_quantile_filt flag)
  • improved interface & README

beta release

22 Sep 09:20

Smudgeplots are now:

  • production ready (fingers crossed)
  • scalable to any genome size and read set (tested on a dataset of 1,000,000,000 kmers)
  • fully automated (tested on a set of 25 obscure genomes)
  • equipped with a rich set of warnings for whenever the inference seems strange