This repository has been archived by the owner on Jun 22, 2022. It is now read-only.
Andrea Telatin edited this page Sep 17, 2020 · 16 revisions

SeqFu

A collection of Sequence FASTX Utilities, partly shipped with this repository and partly coming from external sources.

They were originally built on these principles:

  • Reading both FASTQ and FASTA sequences with the same parser
  • Parsing both name and comments from sequence headers (e.g. >Seq_name length=1200)
  • Supporting .gz input files, and possibly other compression formats
  • Supporting streams (standard input / standard output)
  • Native support for Illumina Paired-End libraries when needed
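
The first four principles can be illustrated with a minimal Python sketch. It is not the parser shipped with these tools: the names Record, open_fastx and parse_fastx are hypothetical, and the sketch assumes four-line FASTQ records, files that are plain text or gzip-compressed, and "-" as a stand-in for standard input.

```python
import gzip
import sys
from typing import Iterable, Iterator, NamedTuple, Optional, TextIO


class Record(NamedTuple):
    name: str            # first word of the header line
    comment: str         # rest of the header line (may be empty)
    seq: str
    qual: Optional[str]  # None for FASTA records


def open_fastx(path: str) -> TextIO:
    """Open a plain or gzip-compressed file; '-' means standard input."""
    if path == "-":
        return sys.stdin
    if path.endswith(".gz"):
        return gzip.open(path, "rt")
    return open(path, "r")


def split_header(header: str):
    """Split 'Seq_name length=1200' into ('Seq_name', 'length=1200')."""
    parts = header.split(None, 1)
    name = parts[0] if parts else ""
    comment = parts[1].strip() if len(parts) > 1 else ""
    return name, comment


def parse_fastx(handle: Iterable[str]) -> Iterator[Record]:
    """Yield records from FASTA ('>') or four-line FASTQ ('@') input."""
    lines = (line.rstrip("\r\n") for line in handle)
    pending = None  # header read ahead while collecting a FASTA sequence
    while True:
        header = pending if pending is not None else next(lines, None)
        pending = None
        if header is None:
            return
        if not header:
            continue  # skip blank lines between records
        if header[0] == ">":
            name, comment = split_header(header[1:])
            chunks = []
            for line in lines:
                if line[:1] in (">", "@"):
                    pending = line  # start of the next record
                    break
                chunks.append(line.strip())
            yield Record(name, comment, "".join(chunks), None)
        elif header[0] == "@":
            name, comment = split_header(header[1:])
            seq = next(lines, "")
            next(lines, None)  # '+' separator line, ignored
            qual = next(lines, "")
            yield Record(name, comment, seq, qual)
        else:
            raise ValueError("unexpected line: " + header)
```

Because parse_fastx only needs an iterable of lines, the same loop serves a file, a .gz file or a stream: for example, `for rec in parse_fastx(open_fastx("-")): ...` reads records from standard input.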

New scripts also adopt BioX::Seq.

Installation

Provided Tools

Legacy tools (pre v. 0.7.0)

  • fu-cat, concatenate FASTX files
  • fu-grep, extract sequences by DNA pattern, by name or comment
  • fu-len, filter sequences by size
  • fu-count, count sequences
  • fu-rename, rename sequences with a prefix
  • fu-sort, sort sequences by size
  • pe-cat, concatenate paired-end files (error tolerant, can be used to repair broken PE)
  • pe-len, filter paired-end sets by length
  • pe-grep, filter paired-end sets
  • pe-ren, rename FASTQ files using barcodes or Illumina paired ends
  • n50, calculate N50, number of sequences, minimum, maximum and total length
  • interleafq, interleave and deinterleave paired sequences
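
As an illustration of the statistics listed for n50, here is a minimal Python sketch. It uses the common definition of N50 (the length of the shortest sequence among the longest ones that together cover at least half of the total length); the n50 tool may handle edge cases differently, and the function name n50_stats is hypothetical.

```python
from typing import Dict, Sequence


def n50_stats(lengths: Sequence[int]) -> Dict[str, int]:
    """Return N50, sequence count, minimum, maximum and total length."""
    if not lengths:
        raise ValueError("at least one sequence length is required")
    ordered = sorted(lengths, reverse=True)  # longest first
    total = sum(ordered)
    running = 0
    n50 = 0
    for length in ordered:
        running += length
        # Stop at the first sequence that brings the running sum to half the total.
        if running * 2 >= total:
            n50 = length
            break
    return {
        "n50": n50,
        "count": len(ordered),
        "min": ordered[-1],
        "max": ordered[0],
        "total": total,
    }
```

For the lengths 2, 3, 4, 5, 6, 7, 8, 9 and 10 the total is 54; the three longest sequences (10, 9, 8) sum to 27, exactly half of the total, so the N50 is 8.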