Include md5sum in JSON or as other output #302
Comments
Hi @aboffin, Thanks for highlighting this issue. I understand this is an important feature. The NCBI Datasets team is actively exploring the implementation of a checksum mechanism. I'll leave this issue open until it is addressed. All the best, Nuala A. O'Leary, PhD
Hi @aboffin, We have added MD5 checksum files to our data packages, including dehydrated packages, as of October 2024. For your example, you could download the data and then validate the downloaded files as follows:
The text "OK" at the end of each line of output indicates that the calculated MD5 hash values match the hash values included in the file. We have some more information about this in our documentation: User-initiated validation using the MD5 checksum file. Thanks again for opening this issue. Best,
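For readers without the md5sum command line tool, the same kind of check can be sketched in Python. This is a minimal illustration, not the NCBI Datasets implementation. It assumes the checksum file uses the usual md5sum layout (hex digest, whitespace, relative file path, one entry per line) and that the paths are relative to the directory holding the checksum file. The function names and the file name md5sum.txt in the usage note are assumptions made for the example.

```python
import hashlib
from pathlib import Path


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, read in chunks to limit memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_checksums(checksum_file):
    """Compare each file listed in an md5sum-style file against its recorded hash.

    Returns a dict mapping each listed path to "OK", "FAILED", or "MISSING".
    Assumption: listed paths are relative to the checksum file's directory.
    """
    checksum_file = Path(checksum_file)
    base_dir = checksum_file.parent
    results = {}
    for line in checksum_file.read_text().splitlines():
        line = line.strip()
        if not line:
            continue
        # Split into the digest and the path; the path may contain spaces.
        expected, name = line.split(maxsplit=1)
        # md5sum marks binary-mode entries with a leading "*" on the path.
        name = name.lstrip("*")
        target = base_dir / name
        if not target.is_file():
            results[name] = "MISSING"
        elif md5_of_file(target) == expected.lower():
            results[name] = "OK"
        else:
            results[name] = "FAILED"
    return results
```

Calling verify_checksums("md5sum.txt") from inside the unpacked package directory (the file name is an assumption here) would return one status per listed file; anything other than "OK" suggests a partial or corrupted download.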
Thanks for including the md5 checksum, but does it also work with Ideally, it would be nice if the checksum would be performed automatically by
Hi,
Thank you for your team's commendable work on datasets, which finally provides a comprehensive and singular way to download data from NCBI. Previously, one had to resort to a multitude of EUtils/Perl/Python scripts that output something almost, but not quite, entirely unlike what we wanted. However, reliability seems to be an issue, as with other tools. Is there a way to check the integrity of the downloads? In the typical example that is given, this information does not exist:
I am perplexed that such a simple checksum mechanism was not provided, considering that networks do fail and partial downloads may lead to confusion at best, and incorrect results at worst, when such genomes are used for further analyses.
I see that issue #206 raised the same question but it was closed without any definitive answer regarding md5sum.