- Added QSEQ parsing function `parse_qseq` and iterator `QseqIterator` to `skbio.parse.sequences`.
- Added `strict` and `lookup` optional parameters to `skbio.stats.distance.mantel` for handling reordering and matching of IDs when provided `DistanceMatrix` instances as input (these parameters were previously only available in `skbio.stats.distance.pwmantel`).
- `skbio.stats.distance.pwmantel` now accepts an iterable of `array_like` objects. Previously, only `DistanceMatrix` instances were allowed.
- Added `read` and `write` methods to `DissimilarityMatrix` and `DistanceMatrix`. These methods can support multiple file formats, automatic file format detection when reading, etc. by taking advantage of scikit-bio's I/O registry system. See `skbio.io` and `skbio.io.dm` for more details. Deprecated `from_file` and `to_file` methods in favor of `read` and `write`. These methods will be removed in scikit-bio 0.3.0.
- Added `read` and `write` methods to `OrdinationResults`. These methods can support multiple file formats, automatic file format detection when reading, etc. by taking advantage of scikit-bio's I/O registry system. See `skbio.io` and `skbio.io.ordres` for more details. Deprecated `from_file` and `to_file` methods in favor of `read` and `write`. These methods will be removed in scikit-bio 0.3.0.
- Added `skbio.stats.ordination.assert_ordination_results_equal` for comparing `OrdinationResults` objects for equality in unit tests.
- `skbio.stats.distance.mantel` now returns a 3-element tuple containing correlation coefficient, p-value, and the number of matching rows/cols in the distance matrices (`n`). The return value was previously a 2-element tuple containing only the correlation coefficient and p-value.
- `skbio.stats.distance.mantel` reorders input `DistanceMatrix` instances based on matching IDs (see optional parameters `strict` and `lookup` for controlling this behavior). In the past, `DistanceMatrix` instances were treated the same as `array_like` input and no reordering took place, regardless of ID (mis)matches. `array_like` input behavior remains the same.
- If mismatched types are provided to `skbio.stats.distance.mantel` (e.g., a `DistanceMatrix` and `array_like`), a `TypeError` will be raised.
- Added git timestamp checking to checklist.py, ensuring that when changes are made to Cython (.pyx) files, their corresponding generated C files are also updated.
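
As a rough illustration of the ID-based reordering described above for `skbio.stats.distance.mantel`, the sketch below restricts two labelled square matrices to their shared IDs and puts them in the same order before they would be compared. It is not scikit-bio's implementation: `match_by_ids` is a hypothetical helper, and the semantics it gives to `strict` (raise when an ID is missing from either matrix) and `lookup` (a dict mapping original IDs to new ones) are assumptions made for illustration.

```python
def match_by_ids(ids_x, x, ids_y, y, strict=True, lookup=None):
    """Restrict two square matrices to their shared IDs, in the same order.

    Hypothetical helper for illustration; not scikit-bio's implementation.
    """
    if lookup is not None:
        # Translate each matrix's IDs into a common naming scheme first.
        ids_x = [lookup[i] for i in ids_x]
        ids_y = [lookup[i] for i in ids_y]

    in_y = set(ids_y)
    shared = [i for i in ids_x if i in in_y]  # keeps the order used by x
    if strict and not (len(shared) == len(ids_x) == len(ids_y)):
        raise ValueError("IDs exist that are not in both matrices")

    def subset(ids, matrix):
        # Map each shared ID to its row/column index in this matrix.
        position = {id_: k for k, id_ in enumerate(ids)}
        order = [position[i] for i in shared]
        return [[matrix[r][c] for c in order] for r in order]

    return shared, subset(ids_x, x), subset(ids_y, y)


x_ids, x = ["a", "b", "c"], [[0, 1, 2], [1, 0, 3], [2, 3, 0]]
y_ids, y = ["c", "a", "b"], [[0, 5, 6], [5, 0, 4], [6, 4, 0]]
ids, x_matched, y_matched = match_by_ids(x_ids, x, y_ids, y)
n = len(ids)  # number of matching rows/cols, analogous to the reported n
```

Plain nested lists carry no IDs, which is consistent with `array_like` input being left in the order given.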
This is an initial alpha release of scikit-bio. At this stage, major backwards-incompatible API changes can and will happen. Many backwards-incompatible API changes were made since the previous release.
- Added ability to compute distances between sequences in a `SequenceCollection` object (#509), and expanded `Alignment.distance` to allow the user to pass a function for computing distances (the default distance metric is still `scipy.spatial.distance.hamming`) (#194).
- Added functionality to not penalize terminal gaps in global alignment. This functionality results in more biologically relevant global alignments (see #537 for discussion of the issue) and is now the default behavior for global alignment.
- The Python global aligners (`global_pairwise_align`, `global_pairwise_align_nucleotide`, and `global_pairwise_align_protein`) now support aligning pairs of sequences, pairs of alignments, and a sequence and an alignment (see #550). This functionality supports progressive multiple sequence alignment, among other things such as adding a sequence to an existing alignment.
- Added `StockholmAlignment.to_file` for writing Stockholm-formatted files.
- Added `strict=True` optional parameter to `DissimilarityMatrix.filter`.
- Added `TreeNode.find_all` for finding all tree nodes that match a given name.
- Fixed bug that resulted in a `ValueError` from `local_align_pairwise_nucleotide` (see #504) under many circumstances. This would not generate incorrect results, but would cause the code to fail.
- Removed `skbio.math`, leaving `stats` and `diversity` to become top level packages. For example, instead of `from skbio.math.stats.ordination import PCoA` you would now import `from skbio.stats.ordination import PCoA`.
- The module `skbio.math.gradient` as well as the contents of `skbio.math.subsample` and `skbio.math.stats.misc` are now found in `skbio.stats`. As an example, to import subsample: `from skbio.stats import subsample`; to import everything from gradient: `from skbio.stats.gradient import *`.
- The contents of `skbio.math.stats.ordination.utils` are now in `skbio.stats.ordination`.
- Removed `skbio.app` subpackage (i.e., the application controller framework) as this code has been ported to the standalone burrito Python package. This code was not specific to bioinformatics and is useful for wrapping command-line applications in general.
- Removed `skbio.core`, leaving `alignment`, `genetic_code`, `sequence`, `tree`, and `workflow` to become top level packages. For example, instead of `from skbio.core.sequence import DNA` you would now import `from skbio.sequence import DNA`.
- Removed `skbio.util.exception` and `skbio.util.warning` (see #577 for the reasoning behind this change). The exceptions/warnings were moved to the following locations:
    - `FileFormatError`, `RecordError`, `FieldError`, and `EfficiencyWarning` have been moved to `skbio.util`
    - `BiologicalSequenceError` has been moved to `skbio.sequence`
    - `SequenceCollectionError` and `StockholmParseError` have been moved to `skbio.alignment`
    - `DissimilarityMatrixError`, `DistanceMatrixError`, `DissimilarityMatrixFormatError`, and `MissingIDError` have been moved to `skbio.stats.distance`
    - `TreeError`, `NoLengthError`, `DuplicateNodeError`, `MissingNodeError`, and `NoParentError` have been moved to `skbio.tree`
    - `FastqParseError` has been moved to `skbio.parse.sequences`
    - `GeneticCodeError`, `GeneticCodeInitError`, and `InvalidCodonError` have been moved to `skbio.genetic_code`
- The contents of `skbio.genetic_code` (formerly `skbio.core.genetic_code`) are now in `skbio.sequence`. The `GeneticCodes` dictionary is now a function `genetic_code`. The functionality is the same, except that because this is now a function rather than a dict, retrieving a genetic code is done using a function call rather than a lookup (so, for example, `GeneticCodes[2]` becomes `genetic_code(2)`).
- Many submodules have been made private with the intention of simplifying imports for users. See #562 for discussion of this change. The following list contains the previous module name and where imports from that module should now come from.
    - `skbio.alignment.ssw` to `skbio.alignment`
    - `skbio.alignment.alignment` to `skbio.alignment`
    - `skbio.alignment.pairwise` to `skbio.alignment`
    - `skbio.diversity.alpha.base` to `skbio.diversity.alpha`
    - `skbio.diversity.alpha.gini` to `skbio.diversity.alpha`
    - `skbio.diversity.alpha.lladser` to `skbio.diversity.alpha`
    - `skbio.diversity.beta.base` to `skbio.diversity.beta`
    - `skbio.draw.distributions` to `skbio.draw`
    - `skbio.stats.distance.anosim` to `skbio.stats.distance`
    - `skbio.stats.distance.base` to `skbio.stats.distance`
    - `skbio.stats.distance.permanova` to `skbio.stats.distance`
    - `skbio.distance` to `skbio.stats.distance`
    - `skbio.stats.ordination.base` to `skbio.stats.ordination`
    - `skbio.stats.ordination.canonical_correspondence_analysis` to `skbio.stats.ordination`
    - `skbio.stats.ordination.correspondence_analysis` to `skbio.stats.ordination`
    - `skbio.stats.ordination.principal_coordinate_analysis` to `skbio.stats.ordination`
    - `skbio.stats.ordination.redundancy_analysis` to `skbio.stats.ordination`
    - `skbio.tree.tree` to `skbio.tree`
    - `skbio.tree.trie` to `skbio.tree`
    - `skbio.util.misc` to `skbio.util`
    - `skbio.util.testing` to `skbio.util`
    - `skbio.util.exception` to `skbio.util`
    - `skbio.util.warning` to `skbio.util`
- Moved `skbio.distance` contents into `skbio.stats.distance`.
- Relaxed requirement in `BiologicalSequence.distance` that sequences being compared are of equal length. This is relevant for Hamming distance, so the check is still performed in that case, but other distance metrics may not have that requirement (see #504).
- Renamed `powertrip.py` repo-checking script to `checklist.py` for clarity.
- `checklist.py` now ensures that all unit tests import from a minimally deep API. For example, it will produce an error if `skbio.core.distance.DistanceMatrix` is used over `skbio.DistanceMatrix`.
- Extra dimension is no longer calculated in `skbio.stats.spatial.procrustes`.
- Expanded documentation in various subpackages.
- Added new scikit-bio logo. Thanks Alina Prassas!
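
For codebases migrating across the module moves listed above, a small rewrite table can catch many stale imports. The sketch below is illustrative only: `MOVED_MODULES` copies a handful of old-to-new pairs from the notes above and is not exhaustive, and `update_import` is a hypothetical helper, not something shipped with scikit-bio. It only handles single-line `from <module> import ...` statements passed without a trailing newline.

```python
import re

# A few old -> new module paths taken from the notes above (not exhaustive).
MOVED_MODULES = {
    "skbio.math.stats.ordination": "skbio.stats.ordination",
    "skbio.core.sequence": "skbio.sequence",
    "skbio.alignment.pairwise": "skbio.alignment",
    "skbio.stats.distance.base": "skbio.stats.distance",
    "skbio.tree.tree": "skbio.tree",
    "skbio.distance": "skbio.stats.distance",
}

# Captures: leading "from ", the dotted module path, and the rest of the line.
_FROM_IMPORT = re.compile(r"^(\s*from\s+)([\w.]+)(\s+import\s.*)$")


def update_import(line):
    """Rewrite a `from <module> import ...` line if <module> has moved."""
    match = _FROM_IMPORT.match(line)
    if match is None:
        return line  # not a from-import statement; leave untouched
    prefix, module, rest = match.groups()
    return prefix + MOVED_MODULES.get(module, module) + rest


old = "from skbio.math.stats.ordination import PCoA"
new = update_import(old)
```

Only exact module matches are rewritten, so a line that already uses a new path passes through unchanged.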
This is a pre-alpha release. At this stage, major backwards-incompatible API changes can and will happen.
- Added Python implementations of Smith-Waterman and Needleman-Wunsch alignment as `skbio.core.alignment.pairwise.local_pairwise_align` and `skbio.core.alignment.pairwise.global_pairwise_align`. These are much slower than native C implementations (e.g., `skbio.core.alignment.local_pairwise_align_ssw`) and as a result raise an `EfficiencyWarning` when called, but are included because they serve as useful educational examples and are simple to experiment with.
- Added `skbio.core.diversity.beta.pw_distances` and `skbio.core.diversity.beta.pw_distances_from_table`. These provide convenient access to the `scipy.spatial.distance.pdist` beta diversity metrics from within scikit-bio. The `skbio.core.diversity.beta.pw_distances_from_table` function will only be available temporarily, until the `biom.table.Table` object is merged into scikit-bio (see #489), at which point `skbio.core.diversity.beta.pw_distances` will be updated to use that.
- Added `skbio.core.alignment.StockholmAlignment`, which provides support for parsing Stockholm-formatted alignment files and working with those alignments in the context of RNA secondary structural information.
- Added `skbio.core.tree.majority_rule` function for computing consensus trees from a list of trees.
- Function `skbio.core.alignment.align_striped_smith_waterman` renamed to `local_pairwise_align_ssw` and now returns an `Alignment` object instead of an `AlignmentStructure`
- The following keyword-arguments for `StripedSmithWaterman` and `local_pairwise_align_ssw` have been renamed:
    - `gap_open` -> `gap_open_penalty`
    - `gap_extend` -> `gap_extend_penalty`
    - `match` -> `match_score`
    - `mismatch` -> `mismatch_score`
- Removed `skbio.util.sort` module in favor of natsort package.
- Added powertrip.py script to perform basic sanity-checking of the repo based on recurring issues that weren't being caught until release time; added to Travis build.
- Added RELEASE.md with release instructions.
- Added intersphinx mappings to docs so that "See Also" references to numpy, scipy, matplotlib, and pandas are hyperlinks.
- The following classes are no longer `namedtuple` subclasses (see #359 for the rationale):
    - `skbio.math.stats.ordination.OrdinationResults`
    - `skbio.math.gradient.GroupResults`
    - `skbio.math.gradient.CategoryResults`
    - `skbio.math.gradient.GradientANOVAResults`
- Added coding guidelines draft.
- Added new alpha diversity formulas to the `skbio.math.diversity.alpha` documentation.
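
Callers that still pass the old keyword names to `StripedSmithWaterman` or `local_pairwise_align_ssw` could be adapted with a thin translation layer. The following is a minimal sketch under that assumption: `rename_ssw_kwargs` is a hypothetical helper, and only the four name pairs come from the renaming list above.

```python
# Old -> new keyword names, copied from the renaming list above.
RENAMED_KWARGS = {
    "gap_open": "gap_open_penalty",
    "gap_extend": "gap_extend_penalty",
    "match": "match_score",
    "mismatch": "mismatch_score",
}


def rename_ssw_kwargs(kwargs):
    """Return a copy of kwargs with old keyword names replaced by new ones."""
    updated = {}
    for name, value in kwargs.items():
        new_name = RENAMED_KWARGS.get(name, name)
        if new_name in updated:
            # Both the old and the new spelling were supplied.
            raise TypeError("duplicate value for keyword %r" % new_name)
        updated[new_name] = value
    return updated


legacy = {"gap_open": 5, "gap_extend": 2, "match": 2, "mismatch": -3}
current = rename_ssw_kwargs(legacy)
```

Keywords that were not renamed pass through unchanged, so the result can be splatted into a call as `**current`.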
This is a pre-alpha release. At this stage, major backwards-incompatible API changes can and will happen.
- Added `enforce_qual_range` parameter to `parse_fastq` (on by default, maintaining backward compatibility). This allows disabling of the quality score range-checking.
- Added `skbio.core.tree.nj`, which applies neighbor-joining for phylogenetic reconstruction.
- Added `bioenv`, `mantel`, and `pwmantel` distance-based statistics to `skbio.math.stats.distance` subpackage.
- Added `skbio.math.stats.misc` module for miscellaneous stats utility functions.
- IDs are now optional when constructing a `DissimilarityMatrix` or `DistanceMatrix` (monotonically-increasing integers cast as strings are automatically used).
- Added `DistanceMatrix.permute` method for randomly permuting rows and columns of a distance matrix.
- Added the following methods to `DissimilarityMatrix`: `filter`, `index`, and `__contains__` for ID-based filtering, index lookup, and membership testing, respectively.
- Added `ignore_comment` parameter to `parse_fasta` (off by default, maintaining backward compatibility). This handles stripping the comment field from the header line (i.e., all characters beginning with the first space) before returning the label.
- Added imports of `BiologicalSequence`, `NucleotideSequence`, `DNA`, `DNASequence`, `RNA`, `RNASequence`, `Protein`, `ProteinSequence`, `DistanceMatrix`, `align_striped_smith_waterman`, `SequenceCollection`, `Alignment`, `TreeNode`, `nj`, `parse_fasta`, `parse_fastq`, `parse_qual`, `FastaIterator`, `FastqIterator`, `SequenceIterator` in `skbio/__init__.py` for convenient importing. For example, it's now possible to `from skbio import Alignment`, rather than `from skbio.core.alignment import Alignment`.
- Fixed a couple of unit tests that could fail stochastically.
- Added missing `__init__.py` files to a couple of test directories so that these tests won't be skipped.
- `parse_fastq` now raises an error on dangling records.
- Fixed several warnings that were raised while running the test suite with Python 3.4.
- Functionality imported from `skbio.core.ssw` must now be imported from `skbio.core.alignment` instead.
- Code is now flake8-compliant; added flake8 checking to Travis build.
- Various additions and improvements to documentation (API, installation instructions, developer instructions, etc.).
- `__future__` imports are now standardized across the codebase.
- New website front page and styling changes throughout. Moved docs site to its own versioned subdirectories.
- Reorganized alignment data structures and algorithms (e.g., SSW code, `Alignment` class, etc.) into an `skbio.core.alignment` subpackage.
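
To make the `ignore_comment` option of `parse_fasta` noted above more concrete: the description says the comment field is everything from the first space of the header line onward. A minimal sketch of that rule follows. `split_label` is a hypothetical stand-in written for illustration, not the parser's actual code, and the example header is made up.

```python
def split_label(header, ignore_comment=False):
    """Return the label for a FASTA header line such as '>seq1 a comment'."""
    # Drop the leading '>' marker if present, plus surrounding whitespace.
    label = header[1:] if header.startswith(">") else header
    label = label.strip()
    if ignore_comment:
        # Keep only the characters before the first space.
        label = label.split(" ", 1)[0]
    return label


full = split_label(">seq1 example comment text")
short = split_label(">seq1 example comment text", ignore_comment=True)
```

With the default `ignore_comment=False` the whole header text is kept, matching the backward-compatible behavior described above.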
Fixes to setup.py. This is a pre-alpha release. At this stage, major backwards-incompatible API changes can and will happen.
Initial pre-alpha release. At this stage, major backwards-incompatible API changes can and will happen.