GitHub - RaphaelRibes/samReader: The goal of the project is to extract relevant information from a mapping file (alignment of sequencing reads onto a reference genome).

                                ____                      __
   _____  ______   ____ ___    / __ \  ___   ______  ____/ /  ___    _____
  / ___/ / __  /  / __ __  \  / /_/ / / _ \ / __  / / __  /  / _ \  / ___/
 (__  ) / /_/ /  / / / / / / / _, _/ /  __// /_/ / / /_/ /  /  __/ / /
/____/  \____/  /_/ /_/ /_/ /_/ |_|  \___/ \____/  \____/   \___/ /_/

samReader is a Python tool designed to analyze SAM files, providing insights into partially mapped and unmapped reads, as well as detailed CIGAR string analysis.

Features

Precise Error Localization: Identifies the exact location and type of errors in the reads.
Summary Reports: Generates a comprehensive summary of the analyses as a PDF report.
Detailed CIGAR Analysis: Provides detailed information about the CIGAR strings of the reads.
Chromosome-Specific Analysis: Generates separate directories for each chromosome, containing mapped, partially mapped, and unmapped reads.
Depth Analysis: Calculates the depth of coverage for each chromosome.
Mapping Quality Evolution: Plots how the mapping quality changes along the length of each chromosome.
Highly Customizable: Offers a wide range of options to customize the analysis (see how to use config.yaml).
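To illustrate what a CIGAR analysis involves, here is a minimal sketch of splitting a CIGAR string into its operations and totalling the bases per operation. It is not taken from samReader's source; the function names `parse_cigar` and `count_bases_per_op` are hypothetical, and only the operation codes come from the SAM specification.

```python
import re
from collections import Counter

# One CIGAR operation: a length followed by an operation code from the SAM spec.
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")


def parse_cigar(cigar: str) -> list[tuple[int, str]]:
    """Split a CIGAR string such as '5S90M2I3M' into (length, op) pairs."""
    if cigar == "*":  # '*' means the CIGAR is unavailable
        return []
    ops = CIGAR_OP.findall(cigar)
    # Reject strings containing characters the pattern did not consume.
    if "".join(length + op for length, op in ops) != cigar:
        raise ValueError(f"malformed CIGAR: {cigar!r}")
    return [(int(length), op) for length, op in ops]


def count_bases_per_op(cigar: str) -> Counter:
    """Total the number of bases covered by each operation code."""
    totals = Counter()
    for length, op in parse_cigar(cigar):
        totals[op] += length
    return totals


print(count_bases_per_op("5S90M2I3M"))
```

A read whose totals contain anything besides `M` (soft clips `S`, insertions `I`, deletions `D`, and so on) is the kind of read a detailed CIGAR report would flag.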

Requirements

Python 3.13 or higher
```
sudo apt-get install python3.13-full
```
TeX Live (the texlive-full package):
```
sudo apt-get install texlive-full
```
The Python packages listed in requirements.txt

Optional

The --auto-open option uses xdg-open to open the PDF report. To use it, install xdg-utils:

```
sudo apt-get install xdg-utils
```

Installation

Clone the Repository:

```
git clone https://github.com/RaphaelRibes/samReader.git
```

Navigate to the Directory:
```
cd samReader
```
Create the Virtual Environment (venv):
```
python3.13 -m venv .venv
```
Activate the venv:
```
source .venv/bin/activate
```
Install Requirements:
```
pip install -r requirements.txt
```

Usage

The minimal command to run samReader is:

```
bash samReader.sh -i /path/to/your/mapping.sam
```

Enabling at least verbose mode is recommended, so that you can follow the progress of the analysis:

```
bash samReader.sh -i /path/to/your/mapping.sam -v
```

How to use config.yaml

The config.yaml file is used to customize the analysis. You can change the following parameters:

version (default 1.6_2020-02-05): The version of the SAM file format. To list every available version, run bash samReader.sh -h.
separator (default -): The separator used in the SAM file to distinguish the chromosome name from the read number. For example, in Clone1-153694, the separator is -. Make sure the same separator is used for every chromosome of your .sam file.
mapq threshold (default 0): The minimum mapping quality for a read to be considered mapped.
significant figures (default 2): The number of significant figures to display in the summary report.
bins (default 100): The number of bins to use for the mapping quality histogram. The higher the number, the finer the histogram.
calculation method (default median for depth and mean for mapq): The method used to summarize the depth of coverage and the mapping quality. Each can be set to mean or median.
n ticks (default 10): The number of ticks to display on the x-axis of the mapping quality evolution plot. The higher the number, the finer the graduation of the x-axis.
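As an illustration of the separator parameter, the sketch below splits a read name into its chromosome name and read number. The helper `split_read_name` is hypothetical, and splitting on the last occurrence of the separator is an assumption, not necessarily what samReader does internally.

```python
def split_read_name(qname: str, separator: str = "-") -> tuple[str, str]:
    """Split a read name such as 'Clone1-153694' into (chromosome, read number)."""
    # Assumption: split on the LAST separator, so that chromosome names
    # which themselves contain the separator are kept whole.
    chromosome, found, read_number = qname.rpartition(separator)
    if not found:
        raise ValueError(f"separator {separator!r} not found in {qname!r}")
    return chromosome, read_number


print(split_read_name("Clone1-153694"))
```

If the read names in a file mix several separators, a single split rule like this one cannot recover the chromosome names reliably, which is why the separator must be consistent across the file.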

The program should work with the default parameters. If you change them, there is no guarantee that the program will still work, so make sure the parameters match your .sam file.

Options

-i or --input: Path to the input SAM file.
-o or --output: (Optional) Specify the output directory. If not provided, the output is saved in the current directory. This option is not functional at the moment.
-t or --trusted: (Optional) Trust the input format without performing format checks.
-v or --verbose: (Optional) Enable verbose mode.
-a or --auto-open: (Optional) Open the summary report after the analysis.
-h or --help: Display the help message.
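When samReader is called from a Python pipeline, the options above can be assembled programmatically. The following is a small sketch under that assumption: `build_command` is a hypothetical helper, it uses only the flags documented here, and it leaves out -o because that option is reported as not working.

```python
def build_command(sam_path: str, verbose: bool = False,
                  trusted: bool = False, auto_open: bool = False) -> list[str]:
    """Build the argument list for a samReader run using the documented flags."""
    command = ["bash", "samReader.sh", "-i", str(sam_path)]
    if trusted:
        command.append("-t")  # skip the input format checks
    if verbose:
        command.append("-v")  # show the progress of the analysis
    if auto_open:
        command.append("-a")  # open the summary report afterwards
    return command


print(build_command("/path/to/your/mapping.sam", verbose=True))
```

The resulting list can be passed to `subprocess.run(command, check=True)` from the repository root, with the virtual environment activated.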

Output

The tool generates the following outputs:

Summary Report: A PDF file (summary.pdf) containing a summary of the analyses.
One directory for each chromosome containing the following files:
- Mapped Reads: A FASTA file (only_mapped.fasta) containing sequences of mapped reads.
- Partially Mapped Reads: A FASTA file (only_partially_mapped.fasta) containing sequences of partially mapped reads.
- Unmapped Reads: A FASTA file (only_unmapped.fasta) containing sequences of unmapped reads.
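The README does not spell out how reads are assigned to the three files, so the sketch below shows one plausible rule rather than samReader's actual logic. It assumes a read is unmapped when FLAG bit 0x4 is set, the CIGAR is `*`, or the MAPQ is below the mapq threshold; mapped when the CIGAR is a single match block such as `100M`; and partially mapped otherwise. The function names are hypothetical; the column positions follow the SAM specification.

```python
import re

# A CIGAR made of a single match block, e.g. '100M'.
FULL_MATCH = re.compile(r"\d+M")


def classify_read(flag: int, mapq: int, cigar: str, mapq_threshold: int = 0) -> str:
    """Return 'unmapped', 'mapped' or 'partially_mapped' (assumed rule)."""
    if flag & 0x4 or cigar == "*" or mapq < mapq_threshold:
        return "unmapped"
    if FULL_MATCH.fullmatch(cigar):
        return "mapped"
    return "partially_mapped"


def fasta_record(qname: str, seq: str) -> str:
    """Format one read as a two-line FASTA record."""
    return f">{qname}\n{seq}\n"


def classify_sam_line(line: str, mapq_threshold: int = 0) -> tuple[str, str]:
    """Classify one SAM alignment line and build its FASTA record."""
    fields = line.rstrip("\n").split("\t")
    # SAM columns: QNAME, FLAG, RNAME, POS, MAPQ, CIGAR, RNEXT, PNEXT, TLEN, SEQ, QUAL
    qname, flag, mapq, cigar, seq = (
        fields[0], int(fields[1]), int(fields[4]), fields[5], fields[9]
    )
    return classify_read(flag, mapq, cigar, mapq_threshold), fasta_record(qname, seq)


line = "Clone1-1\t0\tClone1\t100\t60\t5S5M\t*\t0\t0\tACGTACGTAC\tIIIIIIIIII"
print(classify_sam_line(line))
```

Under this rule, each record would be appended to only_mapped.fasta, only_partially_mapped.fasta, or only_unmapped.fasta in the directory of the read's chromosome.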

Contributing

Contributions are welcome! Please fork the repository and submit a pull request with your changes.

License

This project is licensed under the MIT License. See the LICENSE.md file for details.

Acknowledgments

Special thanks to all contributors and the open-source community for their invaluable support.

ASCII art based on this generator https://patorjk.com/software/taag/
