sketching files containing many small sequences: manysketch is astonishingly fast
#3252
Comments
and on a further somewhat unrelated note, |
and even more so, to add a sketch it is faster to
than it is to run |
Try using |
I'm trying to sketch the RVDB, the Reference Viral Database. The clustered file is ~600 MB.
took about 5 minutes.
didn't finish in 24 hours.
What's the reason!? By my understanding, manysketch isn't multithreaded when reading a single FASTA file, so the speedup isn't coming from multithreading. Presumably it's just the Python for-loop penalty and/or using screed!? Wow.
On a mostly unrelated note, the sig.zip file is larger than the FASTA file. So that sucks.
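For anyone who wants to check whether a file falls into the "many small sequences" case, where per-record overhead would dominate, here is a minimal sketch that counts records and bases using only the standard library. It does not use sourmash or screed, and `fasta_record_stats` is a hypothetical helper name made up for this illustration.

```python
import io


def fasta_record_stats(handle):
    """Return (record_count, total_bases) for a FASTA text stream."""
    records = 0
    bases = 0
    for line in handle:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            records += 1  # each header line starts a new record
        else:
            bases += len(line)  # sequence lines contribute their length
    return records, bases


# Tiny in-memory example; a real file would be passed as open(path).
demo = io.StringIO(">a\nACGT\nAC\n>b\nGG\n")
count, total = fasta_record_stats(demo)
print(count, total, total / count)  # prints: 2 8 4.0
```

A large record count with a small mean length would be consistent with the per-record loop overhead guess above, though it would not confirm it.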
The text was updated successfully, but these errors were encountered: