This repository has been archived by the owner on Jun 22, 2022. It is now read-only.

Home


Andrea Telatin edited this page Sep 17, 2020 · 16 revisions

𝕤𝕖𝕢𝕗𝕦

A collection of Sequence FASTX Utilities, partly shipped with this repository and partly coming from external sources.

They were originally built around these principles:

Reading both FASTQ and FASTA sequences with the same parser
Parsing both name and comments from sequence headers (e.g. >Seq_name length=1200)
Supporting .gz input files, and possibly other compression formats
Supporting streams (standard input / standard output)
Native support for Illumina Paired-End libraries when needed
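
The first four principles can be illustrated with a minimal Python sketch. This is not the code used by the tools in this repository: the names `Record`, `open_fastx`, `split_header` and `read_fastx` are hypothetical, gzip detection by magic bytes is an assumption about how compressed input might be handled, and FASTQ records are assumed to span exactly four lines.

```python
import gzip
import sys
from typing import Iterable, Iterator, NamedTuple, Optional, TextIO, Tuple


class Record(NamedTuple):
    name: str
    comment: str
    seq: str
    qual: Optional[str]  # None for FASTA records


def open_fastx(path: str) -> TextIO:
    """Open plain or gzip-compressed input as text; "-" means standard input."""
    if path == "-":
        return sys.stdin  # assumption: data on stdin is not compressed
    with open(path, "rb") as probe:
        magic = probe.read(2)
    if magic == b"\x1f\x8b":  # gzip magic bytes
        return gzip.open(path, "rt")
    return open(path, "rt")


def split_header(header: str) -> Tuple[str, str]:
    """Split 'Seq_name length=1200' into ('Seq_name', 'length=1200')."""
    parts = header.split(None, 1)
    name = parts[0] if parts else ""
    comment = parts[1].strip() if len(parts) > 1 else ""
    return name, comment


def read_fastx(handle: Iterable[str]) -> Iterator[Record]:
    """Yield records from FASTA (possibly multi-line) or 4-line FASTQ input."""
    lines = (line.rstrip("\r\n") for line in handle)
    pending = None  # header read ahead while collecting FASTA sequence lines
    while True:
        header = pending if pending is not None else next(lines, None)
        pending = None
        if header is None:
            return
        if not header:
            continue  # skip blank lines between records
        if header[0] == ">":
            chunks = []
            for line in lines:
                if line.startswith((">", "@")):
                    pending = line
                    break
                chunks.append(line.strip())
            name, comment = split_header(header[1:])
            yield Record(name, comment, "".join(chunks), None)
        elif header[0] == "@":
            seq = next(lines, "")
            next(lines, "")  # the '+' separator line
            qual = next(lines, "")
            name, comment = split_header(header[1:])
            yield Record(name, comment, seq, qual)
        else:
            raise ValueError("unexpected line: " + header[:30])
```

Because `read_fastx` accepts any iterable of lines, the same function can serve files, gzip streams and standard input, for example `for rec in read_fastx(open_fastx("-")): print(rec.name, len(rec.seq))`.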

Newer scripts also use the BioX::Seq Perl module.

Installation

Provided Tools

seqfu - core utility

Legacy tools (pre v. 0.7.0)

fu-cat, concatenate FASTX files
fu-grep, extract sequences by DNA pattern, by name or comment
fu-len, filter sequences by size
fu-count, count sequences
fu-rename, rename sequences with a prefix
fu-sort, sort sequences by size
pe-cat, concatenate paired-end files (error tolerant, can be used to repair broken paired-end sets)
pe-len, filter paired end sets by length
pe-grep, filter paired end sets
pe-ren, rename FASTQ files using barcodes or Illumina paired ends
n50, calculate N50, number of sequences, minimum, maximum and total length
interleafq, interleave and deinterleave paired sequences
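
As an illustration of the statistics listed for n50, here is a minimal Python sketch using the usual definition of N50: the length of the shortest sequence in the smallest set of longest sequences that together cover at least half of the total length. The function names `n50` and `length_stats` are hypothetical, and the n50 tool itself may differ in edge cases and output format.

```python
from typing import Dict, List


def n50(lengths: List[int]) -> int:
    """Return the N50 of a list of sequence lengths (0 for an empty list)."""
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if running * 2 >= total:  # reached at least half of the total length
            return length
    return 0


def length_stats(lengths: List[int]) -> Dict[str, int]:
    """Summarise sequence lengths: count, total, minimum, maximum and N50."""
    return {
        "count": len(lengths),
        "total": sum(lengths),
        "min": min(lengths) if lengths else 0,
        "max": max(lengths) if lengths else 0,
        "n50": n50(lengths),
    }
```

For lengths 2, 3, 4, 5, 6, 7, 8, 9 and 10 the total is 54, and the three longest sequences (10 + 9 + 8 = 27) reach half of it, so the N50 is 8.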

🧬 SeqFU - a collection of tools to parse and manipulate FASTA and FASTQ files, supporting compressed input
