Merge pull request #369 from gbouras13/dev
v1.7.4 with `--trna_scan_model`
gbouras13 authored Nov 22, 2024
2 parents 121ffb8 + 2d91957 commit a1bd525
Showing 12 changed files with 114 additions and 55 deletions.
9 changes: 5 additions & 4 deletions .github/workflows/ci.yaml
@@ -13,7 +13,7 @@ jobs:

strategy:
matrix:
os: [macos-latest, ubuntu-latest]
os: [macos-13, ubuntu-latest] #macos-latest/macos-14 is M1 - some deps
python-version: ["3.9"]

steps:
@@ -22,25 +22,26 @@ jobs:
fetch-depth: 0

# Setup env
- uses: "conda-incubator/setup-miniconda@v2"
- uses: "conda-incubator/setup-miniconda@v3"
with:
activate-environment: pharokka_env
environment-file: environment.yml
python-version: ${{ matrix.python-version }}
auto-activate-base: false
miniforge-variant: Mambaforge
channels: conda-forge,bioconda,defaults
channel-priority: strict
auto-update-conda: true
- name: Install pharokka
shell: bash -l {0}
run: |
mamba install python=${{ matrix.python-version }}
conda install python=${{ matrix.python-version }}
python -m pip install --upgrade pip
pip install -e .
- name: Run tests and collect coverage
run: pytest --cov=./ --cov-report=xml
- name: Upload coverage reports to Codecov
uses: codecov/codecov-action@v3
with:
version: v0.7.3
env:
CODECOV_TOKEN: ${{ secrets.CODECOV_TOKEN }}
3 changes: 1 addition & 2 deletions .github/workflows/release.yaml
@@ -12,13 +12,12 @@ jobs:

steps:
- uses: actions/checkout@v2
- uses: conda-incubator/setup-miniconda@v2
- uses: conda-incubator/setup-miniconda@v3
with:
python-version: 3.9
activate-environment: pharokka_env
environment-file: environment.yml
auto-activate-base: false
miniforge-variant: Mambaforge
channels: conda-forge,bioconda,defaults
channel-priority: strict
auto-update-conda: true
7 changes: 7 additions & 0 deletions HISTORY.md
@@ -1,6 +1,13 @@
History
=======

1.7.4 (2024-11-23)
------------------

* Adds `--trna_scan_model` parameter with two accepted options: `--trna_scan_model general` (this will be run by default - what Pharokka has always been running) and `--trna_scan_model bacterial`. See the [tRNAscan-SE paper](https://doi.org/10.1093/nar/gkab688) for more information.
* Bumps the `dnaapler` dependency to v1.0.1 due to a breaking dependency change in `dnaapler`.
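
A minimal sketch of how the new option could translate into a tRNAscan-SE call, based only on the flags visible in this commit (`-G` for the general model, `-B` for the bacterial one). The helper name `build_trnascan_command` and the example file names are hypothetical and not part of Pharokka. Unlike the committed code, which treats anything other than `general` as bacterial, this sketch rejects unknown values.

```python
# Hypothetical helper: the name, signature and file names are illustrative only.
MODEL_FLAGS = {"general": "G", "bacterial": "B"}


def build_trnascan_command(filepath_in, filepath_out, threads=1, trna_scan_model="general"):
    """Return a tRNAscan-SE command line for the chosen model."""
    if trna_scan_model not in MODEL_FLAGS:
        raise ValueError(f"unknown tRNAscan-SE model: {trna_scan_model}")
    model = MODEL_FLAGS[trna_scan_model]
    # Same flag order as the command string built in bin/processes.py in this commit.
    return f"tRNAscan-SE {filepath_in} --thread {threads} -{model} -Q -j {filepath_out}"


if __name__ == "__main__":
    print(build_trnascan_command("phage.fasta", "trnascan_out.gff", 4, "bacterial"))
```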


1.7.3 (2024-07-10)
------------------

43 changes: 30 additions & 13 deletions README.md
@@ -66,6 +66,7 @@ If you don't want to install `pharokka` or `phold` locally, you can run `pharokk
- [Source](#source)
- [Database Installation](#database-installation)
- [Beginner Conda Installation](#beginner-conda-installation)
- [Beginner Conda Installation](#beginner-conda-installation-1)
- [Usage](#usage)
- [Version Log](#version-log)
- [System](#system)
@@ -294,35 +295,49 @@ which will create a directory called "pharokka_v1.4.0_databases" containing the

If you are new to using the command-line, please install conda using the following instructions.

1. Install [Anaconda](https://www.anaconda.com/products/distribution). I would recommend [miniconda](https://docs.conda.io/en/latest/miniconda.html).
2. Assuming you are using a Linux x86_64 machine (for other architectures, please replace the URL with the appropriate one on the [miniconda](https://docs.conda.io/en/latest/miniconda.html) website).

`curl -O https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh`
# Beginner Conda Installation

If you are new to using the command-line, please install conda using the following instructions.

1. Install Conda - I would recommend [miniforge](https://github.com/conda-forge/miniforge).
2. Assuming you are using a Linux x86_64 machine (for other architectures, please replace the URL with the appropriate one on the [miniforge](https://github.com/conda-forge/miniforge) repository).

`wget https://github.com/conda-forge/miniforge/releases/download/24.9.2-0/Miniforge3-24.9.2-0-Linux-x86_64.sh`

For Mac Intel:

`wget https://github.com/conda-forge/miniforge/releases/download/24.9.2-0/Miniforge3-24.9.2-0-MacOSX-x86_64.sh`

For Mac (Intel, will also work with M1):
For Mac M1/M2/M3/M4

`curl -O https://repo.anaconda.com/miniconda/Miniconda3-latest-MacOSX-x86_64.sh`
`wget https://github.com/conda-forge/miniforge/releases/download/24.9.2-0/Miniforge3-24.9.2-0-MacOSX-arm64.sh`

3. Install miniconda and follow the prompts.
3. Install miniforge and follow the prompts.

`sh Miniconda3-latest-Linux-x86_64.sh`
`sh Miniforge3-24.9.2-0-Linux-x86_64.sh`

4. After installation is complete, you should add the following channels to your conda configuration:

```
```bash
conda config --add channels defaults
conda config --add channels bioconda
conda config --add channels conda-forge
```

5. After this, conda should be installed (you may need to restart your terminal). It is recommended that mamba is also installed, as it will solve the enviroment quicker than conda:
5. Finally, I would recommend installing pharokka into a fresh environment. For example to create an environment called pharokkaENV with pharokka installed:

`conda install mamba`
```bash
conda create -n pharokkaENV pharokka
conda activate pharokkaENV
install_databases.py -h
pharokka.py -h
```

6. Finally, I would recommend installing `pharokka` into a fresh environment. For example to create an environment called pharokkaENV with `pharokka` installed:
If you have a Mac with Apple Silicon (M1-M4), try

```
mamba create -n pharokkaENV pharokka
```bash
conda create --platform osx-64 -n pharokkaENV pharokka
conda activate pharokkaENV
install_databases.py -h
pharokka.py -h
@@ -399,6 +414,8 @@ options:
extra commands to pass to MINced (please omit the leading hyphen for the first argument). You will need to use quotation marks e.g. --minced_args "minNR 2 -minRL 21"
--mash_distance MASH_DISTANCE
mash distance for the search against INPHARED. Defaults to 0.2.
--trna_scan_model {general,bacterial}
tRNAscan-SE model
-V, --version Print pharokka Version
--citation Print pharokka Citation
```
34 changes: 23 additions & 11 deletions bin/input_commands.py
@@ -161,6 +161,13 @@ def get_input():
default=0.2,
type=float,
)
parser.add_argument(
"--trna_scan_model",
help="tRNAscan-SE model",
choices=["general", "bacterial"],
default="general",
type=str,
)
parser.add_argument(
"-V",
"--version",
@@ -408,17 +415,22 @@ def check_dependencies(skip_mash):
else:
raise ValueError("MMseqs2 version not found")

mmseqs_major_version = int(mmseqs_version.split(".")[0])
mmseqs_minor_version = mmseqs_version.split(".")[1]
# Ryan Wick's code from Dnaapler - for prebuilt binary on github
if mmseqs_version.startswith("45111b6"):
logger.info(f"MMseqs2 version found is {mmseqs_version}")
else:

logger.info(
f"MMseqs2 version found is v{mmseqs_major_version}.{mmseqs_minor_version}"
)
mmseqs_major_version = int(mmseqs_version.split(".")[0])
mmseqs_minor_version = mmseqs_version.split(".")[1]

logger.info(
f"MMseqs2 version found is v{mmseqs_major_version}.{mmseqs_minor_version}"
)

if mmseqs_major_version != 13:
logger.error("MMseqs2 is the wrong version. Please install v13.45111")
if mmseqs_minor_version != '45111':
logger.error("MMseqs2 is the wrong version. Please install v13.45111")
if mmseqs_major_version != 13:
logger.error("MMseqs2 is the wrong version. Please install v13.45111")
if mmseqs_minor_version != '45111':
logger.error("MMseqs2 is the wrong version. Please install v13.45111")

logger.info("MMseqs2 version is ok.")
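
The version check in this hunk can be summarised as a small predicate. This is a sketch under assumptions: `mmseqs_version_ok` is a hypothetical name, and it returns a boolean instead of logging as the committed code does.

```python
def mmseqs_version_ok(mmseqs_version: str) -> bool:
    """Hypothetical predicate mirroring the two accepted version formats."""
    # Prebuilt GitHub binaries report a commit hash instead of "13.45111".
    if mmseqs_version.startswith("45111b6"):
        return True
    parts = mmseqs_version.split(".")
    # Guard against strings that are not "<major>.<minor>".
    if len(parts) < 2 or not parts[0].isdigit():
        return False
    # The minor part is compared as a string, as in the diff.
    return int(parts[0]) == 13 and parts[1] == "45111"
```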

@@ -574,8 +586,8 @@ def check_dependencies(skip_mash):
f"Dnaapler version found is v{dnaapler_major_version}.{dnaapler_minor_version}.{dnaapler_minorest_version}"
)

if dnaapler_minor_version < 2:
logger.error("Dnaapler is the wrong version. Please re-install pharokka.")
if dnaapler_major_version < 1:
logger.error("Dnaapler is the wrong version. Please install Dnaapler v1.0.1 or higher.")

logger.info("Dnaapler version is ok.")

4 changes: 2 additions & 2 deletions bin/pharokka.py
@@ -303,11 +303,11 @@ def main():
if args.skip_extra_annotations is False:
if args.meta == True:
logger.info("Starting tRNA-scanSE. Applying meta mode.")
run_trnascan_meta(input_fasta, out_dir, args.threads, num_fastas)
run_trnascan_meta(input_fasta, out_dir, args.threads, num_fastas, args.trna_scan_model)
concat_trnascan_meta(out_dir, num_fastas)
else:
logger.info("Starting tRNA-scanSE.")
run_trna_scan(input_fasta, args.threads, out_dir, logdir)
run_trna_scan(input_fasta, args.threads, out_dir, logdir, args.trna_scan_model)
# run minced and aragorn
run_minced(input_fasta, out_dir, prefix, args.minced_args, logdir)
run_aragorn(input_fasta, out_dir, prefix, logdir)
18 changes: 14 additions & 4 deletions bin/processes.py
@@ -197,7 +197,7 @@ def concat_phanotate_meta(out_dir, num_fastas):
outfile.write(infile.read())


def run_trnascan_meta(filepath_in, out_dir, threads, num_fastas):
def run_trnascan_meta(filepath_in, out_dir, threads, num_fastas, trna_scan_model):
"""
Runs trnascan to output gffs one contig per thread
:param filepath_in: input filepath
@@ -210,12 +210,17 @@ def run_trnascan_meta(filepath_in, out_dir, threads, num_fastas):
input_tmp_dir = os.path.join(out_dir, "input_split_tmp")
commands = []

if trna_scan_model == "general":
model = "G"
else: # bacterial
model = "B"

for i in range(1, num_fastas + 1):
in_file = "input_subprocess" + str(i) + ".fasta"
out_file = "trnascan_tmp" + str(i) + ".gff"
filepath_in = os.path.join(input_tmp_dir, in_file)
filepath_out = os.path.join(input_tmp_dir, out_file)
cmd = "tRNAscan-SE " + filepath_in + " --thread 1 -G -Q -j " + filepath_out
cmd = f"tRNAscan-SE {filepath_in} --thread 1 -{model} -Q -j {filepath_out}"
commands.append(cmd)

n = int(threads) # the number of parallel processes you want
@@ -632,7 +637,7 @@ def translate_fastas(out_dir, gene_predictor, coding_table, genbank_file):
# for genbank do nothing


def run_trna_scan(filepath_in, threads, out_dir, logdir):
def run_trna_scan(filepath_in, threads, out_dir, logdir, trna_scan_model):
"""
Runs trna scan
:param filepath_in: input filepath
@@ -643,11 +648,16 @@ def run_trna_scan(filepath_in, threads, out_dir, logdir):

out_gff = os.path.join(out_dir, "trnascan_out.gff")

if trna_scan_model == "general":
model = "G"
else: # bacterial
model = "B"

trna = ExternalTool(
tool="tRNAscan-SE",
input=f"{filepath_in}",
output=f"{out_gff}",
params=f"--thread {threads} -G -Q -j",
params=f"--thread {threads} -{model} -Q -j",
logdir=logdir,
outfile="",
)
2 changes: 1 addition & 1 deletion bin/version.py
@@ -1 +1 @@
__version__ = "1.7.3"
__version__ = "1.7.4"
39 changes: 24 additions & 15 deletions docs/install.md
@@ -10,11 +10,11 @@ conda install -c bioconda pharokka

This will install all the dependencies along with `pharokka`. The dependencies are listed in environment.yml.

If conda is taking a long time to solve the environment, try using mamba:
If you have a MacOS system with M1/M2/M3 Apple Silicon, try this

```bash
conda install mamba
mamba install -c bioconda pharokka
conda create --platform osx-64 --name pharokkaENV -c bioconda pharokka
conda activate pharokkaENV
```

## Pip
@@ -100,18 +100,22 @@ which will create a directory called "pharokka_v1.4.0_databases" containing the

If you are new to using the command-line, please install conda using the following instructions.

1. Install [Anaconda](https://www.anaconda.com/products/distribution). I would recommend [miniconda](https://docs.conda.io/en/latest/miniconda.html).
2. Assuming you are using a Linux x86_64 machine (for other architectures, please replace the URL with the appropriate one on the [miniconda](https://docs.conda.io/en/latest/miniconda.html) website).
1. Install Conda - I would recommend [miniforge](https://github.com/conda-forge/miniforge).
2. Assuming you are using a Linux x86_64 machine (for other architectures, please replace the URL with the appropriate one on the [miniforge](https://github.com/conda-forge/miniforge) repository).

`wget https://github.com/conda-forge/miniforge/releases/download/24.9.2-0/Miniforge3-24.9.2-0-Linux-x86_64.sh`

For Mac Intel:

`curl -O https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh`
`wget https://github.com/conda-forge/miniforge/releases/download/24.9.2-0/Miniforge3-24.9.2-0-MacOSX-x86_64.sh`

For Mac (Intel, will also work with M1):
For Mac M1/M2/M3/M4

`curl -O https://repo.anaconda.com/miniconda/Miniconda3-latest-MacOSX-x86_64.sh`
`wget https://github.com/conda-forge/miniforge/releases/download/24.9.2-0/Miniforge3-24.9.2-0-MacOSX-arm64.sh`

3. Install miniconda and follow the prompts.
3. Install miniforge and follow the prompts.

`sh Miniconda3-latest-Linux-x86_64.sh`
`sh Miniforge3-24.9.2-0-Linux-x86_64.sh`

4. After installation is complete, you should add the following channels to your conda configuration:

@@ -121,15 +125,20 @@ conda config --add channels bioconda
conda config --add channels conda-forge
```

5. After this, conda should be installed (you may need to restart your terminal). It is recommended that mamba is also installed, as it will solve the enviroment quicker than conda:
5. Finally, I would recommend installing pharokka into a fresh environment. For example to create an environment called pharokkaENV with pharokka installed:

`conda install mamba`
```bash
conda create -n pharokkaENV pharokka
conda activate pharokkaENV
install_databases.py -h
pharokka.py -h
```

6. Finally, I would recommend installing pharokka into a fresh environment. For example to create an environment called pharokkaENV with pharokka installed:
If you have a Mac with Apple Silicon (M1-M4), try

```bash
mamba create -n pharokkaENV pharokka
conda create --platform osx-64 -n pharokkaENV pharokka
conda activate pharokkaENV
install_databases.py -h
pharokka.py -h
```
```
4 changes: 4 additions & 0 deletions docs/run.md
@@ -159,6 +159,8 @@ Note 2 things: (1) that you need to leave off the leading hyphen (i.e. `"minNR 2
pharokka.py -i <fasta file> -o <output folder> -d <path/to/database_dir> -t <threads> --minced_args "minNR 2 -minRL 21"
```

As of v1.7.4 you can specify the bacterial `tRNAscan-SE` model using `--trna_scan_model bacterial`. Otherwise, `pharokka` uses the general model by default. See the [tRNAscan-SE paper](https://doi.org/10.1093/nar/gkab688) for more information.
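
To illustrate how the option behaves on the command line, here is a stand-alone sketch that reuses the `add_argument` call from this commit. The `build_parser` wrapper is hypothetical; Pharokka defines its arguments inside `get_input()` alongside many others.

```python
import argparse


def build_parser():
    """Hypothetical wrapper holding only the new option."""
    parser = argparse.ArgumentParser(prog="pharokka.py")
    parser.add_argument(
        "--trna_scan_model",
        help="tRNAscan-SE model",
        choices=["general", "bacterial"],
        default="general",
        type=str,
    )
    return parser


if __name__ == "__main__":
    # With no flag given, the general model is selected.
    print(build_parser().parse_args([]).trna_scan_model)
    # An explicit choice overrides the default.
    print(build_parser().parse_args(["--trna_scan_model", "bacterial"]).trna_scan_model)
```

Because `choices` is set, argparse exits with a usage error for any value other than `general` or `bacterial`.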


```bash
usage: pharokka.py [-h] [-i INFILE] [-o OUTDIR] [-d DATABASE] [-t THREADS] [-f] [-p PREFIX] [-l LOCUSTAG] [-g GENE_PREDICTOR] [-m] [-s]
@@ -217,6 +219,8 @@ options:
extra commands to pass to MINced (please omit the leading hyphen for the first argument). You will need to use quotation marks e.g. --minced_args "minNR 2 -minRL 21"
--mash_distance MASH_DISTANCE
mash distance for the search against INPHARED. Defaults to 0.2.
--trna_scan_model {general,bacterial}
tRNAscan-SE model
-V, --version Print pharokka Version
--citation Print pharokka Citation
```
2 changes: 1 addition & 1 deletion environment.yml
@@ -12,7 +12,7 @@ dependencies:
- minced >=0.4.2
- aragorn >=1.2.41
- mash >=2.2
- dnaapler >=0.4.0
- dnaapler >=1.0.1
- pyrodigal >=3.1.0
- pyrodigal-gv >=0.2.0
- pycirclize >=0.3.1
4 changes: 2 additions & 2 deletions setup.py
@@ -18,8 +18,8 @@ def package_files(directory):
long_description = fh.read()

setup(
name="Pharokka",
version="1.7.3",
name="pharokka",
version="1.7.4",
author="George Bouras",
author_email="[email protected]",
description="Fast phage annotation tool",
