remove space in genome header and special nt in hairpin for mirdeep2 … #79
Thanks for this @sdjebali ! Couple of minor questions / suggestions on the code..
main.nf
perl -ane 's/y/N/ig;print;' $hairpin > hairpin_yn.fa
awk '{ gsub(/Y/,"N",\$0); gsub(/B/,"N",\$0); gsub(/K/,"N",\$0); gsub(/M/,"N",\$0); gsub(/R/,"N",\$0); gsub(/S/,"N",\$0); gsub(/W/,"N",\$0); print}' $hairpin > hairpin_ok.fa
I think you're just adding other nucleotides beyond Y here, right? Is there a reason that we can't just do:
perl -ane 's/[ybkmrsw]/N/ig;print;' $hairpin > hairpin_yn.fa
I'd be inclined to use this type of command instead as it's a lot more succinct / easy to read and also case-insensitive..
Also I'm not sure I see where you are removing spaces in the headers? Am I missing that somewhere?
Thanks for having a look.
I am fine with using the perl command instead of mine, but maybe use a different name for the output file, as we are no longer replacing only Y nucleotides.
I am removing spaces in headers for the genome file, not the hairpin file, it is at another location in the code
> I am removing spaces in headers for the genome file, not the hairpin file, it is at another location in the code
Right, that's what I was wondering - but I can't see that in this pull request, right?
I have now added the actual command to remove spaces in genome headers
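The command itself is not shown in this thread. One way such a step could look, as a minimal sketch under stated assumptions: the file names `genome_demo.fa` and `genome_demo_nospace.fa` are hypothetical, and replacing each space with an underscore (rather than, say, truncating the header at the first space) is an assumption, not necessarily what the pull request does.

```shell
# Toy genome FASTA whose header contains spaces (hypothetical file name and content).
printf '>1 dna:chromosome chromosome:demo\nACGTACGT\n' > genome_demo.fa

# On header lines only (those starting with ">"), replace each space with an underscore.
sed '/^>/ s/ /_/g' genome_demo.fa > genome_demo_nospace.fa

cat genome_demo_nospace.fa
```

Restricting the substitution to header lines leaves the sequence lines byte-for-byte unchanged.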
this is to take into account the files generated by the lint command
Co-authored-by: Phil Ewels <[email protected]>
LGTM! 👍
To get mirdeep2 to work on version 11.1 of the pig genome, I had to remove spaces from the genome headers and replace special nucleotide characters with N in the hairpin file.
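To see whether a given pair of input files is affected by either problem, a quick check along these lines may help. It is only a sketch: the file names and toy contents are hypothetical, and treating anything other than A, C, G, T, U, or N as a "special" character is an assumption about what the downstream tool accepts.

```shell
# Toy inputs (hypothetical file names and contents).
printf '>1 dna:chromosome\nACGT\n>2\nACGT\n' > genome_check.fa
printf '>demo-mir-1\nACGUY\n>demo-mir-2\nACGUN\n' > hairpin_check.fa

# Count genome header lines that contain at least one space.
headers_with_space=$(grep -c '^>.* ' genome_check.fa)

# Count hairpin sequence lines containing anything other than A, C, G, T, U, or N.
odd_seq_lines=$(grep -v '^>' hairpin_check.fa | grep -c '[^ACGTUNacgtun]')

echo "headers with spaces: $headers_with_space"
echo "sequence lines with other characters: $odd_seq_lines"
```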
Many thanks for contributing to nf-core/smrnaseq!
Please fill in the appropriate checklist below (delete whatever is not relevant). These are the most common things requested on pull requests (PRs).
PR checklist
Test suite passes (nextflow run . -profile test,docker).
Code lints (nf-core lint .).
docs is updated
CHANGELOG.md is updated
README.md is updated
Learn more about contributing: https://github.com/nf-core/smrnaseq/tree/master/.github/CONTRIBUTING.md