add tax summarize
#2333
Codecov Report
@@            Coverage Diff             @@
##           latest    #2333      +/-   ##
==========================================
+ Coverage   83.98%   84.06%   +0.08%
==========================================
  Files         129      130       +1
  Lines       14969    15059      +90
  Branches     2192     2212      +20
==========================================
+ Hits        12572    12660      +88
  Misses       2103     2103
- Partials      294      296       +2
ready for review @bluegenes!
lgtm!!
This PR adds a tax summarize command per #2212.
It also:
- tax annotate as taxonomy spreadsheets (provide taxonomy operations that work on semicolon-separated lineages #2185)
- sourmash tax prepare fails with "No taxonomic identifiers found." #2326

Tackles #2212
Tackles #2185
Tackles parts of #2326
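For anyone skimming, here is a rough sketch of the kind of per-rank summary that a command like this could produce from semicolon-separated lineages. This is not the code in this PR: the rank list, the function names (`summarize_lineages`, `write_summary_csv`) and the CSV column names are all assumptions made up for illustration.

```python
import csv
import io
from collections import Counter, defaultdict

# Hypothetical rank names for illustration; a real taxonomy may use others.
RANKS = ["superkingdom", "phylum", "class", "order", "family", "genus", "species"]


def summarize_lineages(lineages):
    """Count how many input lineages share each lineage prefix, rank by rank."""
    counts = defaultdict(Counter)
    for lineage in lineages:
        names = [name.strip() for name in lineage.split(";")]
        for depth, rank in enumerate(RANKS):
            # Stop at the first missing or empty rank in this lineage.
            if depth >= len(names) or not names[depth]:
                break
            prefix = ";".join(names[: depth + 1])
            counts[rank][prefix] += 1
    return counts


def write_summary_csv(counts, fp):
    """Write one row per (rank, lineage prefix); column names are assumed."""
    writer = csv.writer(fp)
    writer.writerow(["rank", "lineage_count", "lineage"])
    for rank in RANKS:
        for prefix, n in counts[rank].most_common():
            writer.writerow([rank, n, prefix])


if __name__ == "__main__":
    demo = ["a;b;c", "a;b;d", "a;e"]  # toy lineages, not real data
    out = io.StringIO()
    write_summary_csv(summarize_lineages(demo), out)
    print(out.getvalue())
```

Counting by full prefix rather than by bare name keeps two identically named taxa under different parents separate, which seems like the safer default for a summary.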
TODO
- tax crosscheck --db db --taxonomy <taxonomy> that will tell us which identifiers don't have taxonomy, and which taxonomy entries don't have sketches? Punted to "add tax crosscheck to compare databases and lineages for correctness" (#2361).

Example output
Running on a traditional taxonomy file:
On a gather-with-lineages file:
On the bad CSV file from #2326:
CSV output of per-rank information
With CSV output,
and aaa.csv looks like: