remove space in genome header and special nt in hairpin for mirdeep2 … #79

Merged (9 commits) on May 10, 2021
1 change: 1 addition & 0 deletions CHANGELOG.md
@@ -2,6 +2,7 @@

## v1.1dev - [date]

+ * remove spaces in genome headers and replace special nt by N in hairpin file for mirdeep2 to work. Follow up from [[#69]] (https://github.com/nf-core/smrnaseq/pull/79)
* Accept custom genome and remove non-canonical letters in the genome. Thanks to @sdjebali. Follow up from [[#63]](https://github.com/nf-core/smrnaseq/pull/63)
* Fix error when only one sample is in the input [[#31]](https://github.com/nf-core/smrnaseq/issues/31)
* Change `--reads` to `--input` for consistency with rest of nf-core
4 changes: 2 additions & 2 deletions main.nf
@@ -840,15 +840,15 @@ process mirdeep2 {

script:
"""
- perl -ane 's/y/N/ig;print;' $hairpin > hairpin_yn.fa
+ awk '{ gsub(/Y/,"N",\$0); gsub(/B/,"N",\$0); gsub(/K/,"N",\$0); gsub(/M/,"N",\$0); gsub(/R/,"N",\$0); gsub(/S/,"N",\$0); gsub(/W/,"N",\$0); print}' $hairpin > hairpin_ok.fa

Member

I think you're just adding other nucleotides beyond Y here, right? Is there a reason that we can't just do:

perl -ane 's/[ybkmrsw]/N/ig;print;' $hairpin > hairpin_yn.fa

I'd be inclined to use this type of command instead, as it's a lot more succinct, easier to read, and also case-insensitive.

Member

Also, I'm not sure I see where you are removing spaces in the headers. Am I missing that somewhere?

Contributor Author

Thanks for having a look.

I am fine with using the perl command instead of mine, but maybe use a different name for the output file, as we no longer replace only Y nt.

I am removing spaces in headers for the genome file, not the hairpin file, it is at another location in the code

Member

> I am removing spaces in headers for the genome file, not the hairpin file, it is at another location in the code

Right, that's what I was wondering - but I can't see that in this pull request, right?

Contributor Author

I have now added the actual command to remove spaces in genome headers.
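
The command itself is not shown in this part of the diff. As a rough sketch only, one way to get whitespace-free FASTA headers in shell is to edit only the lines that start with `>`. The function name `strip_header_spaces`, the demo record, and the choice to replace spaces with underscores (rather than truncating the header at the first space) are all assumptions, and the command actually added to `main.nf` may differ.

```shell
#!/usr/bin/env bash

# Hypothetical helper (not the PR's command): replace spaces with underscores
# on header lines only; sequence lines pass through unchanged.
strip_header_spaces() { sed '/^>/ s/ /_/g'; }

# Made-up genome record for illustration.
printf '>chr1 demo description\nACGTACGT\n' | strip_header_spaces
```

Restricting the substitution to header lines with the `/^>/` address keeps the sequence itself byte-for-byte identical.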


miRDeep2.pl \\
$reads_collapsed \\
$refgenome \\
$reads_vs_refdb \\
$mature \\
none \\
- hairpin_yn.fa \\
+ hairpin_ok.fa \\
-d \\
-z _${reads_collapsed.simpleName}
"""