GHCN Subset: /Daily #332

Open
gabefair opened this issue Feb 24, 2017 · 8 comments

@gabefair
Collaborator

gabefair commented Feb 24, 2017

This dataset is part of the much larger Global Historical Climatology Network (GHCN) dataset tracked in #331, which we needed to break up into parts.

  • Agency: NOAA
  • Data Size: 5.329 TB
  • FTP/HTTP URL: ftp://ftp.ncdc.noaa.gov/pub/data/ghcn/daily/
    Recommended command: wget --mirror --timestamping --page-requisites --adjust-extension --no-parent --convert-links -e robots=off --output-file=ftp_ncdc_noaa_gov_pub_data_ghcn_daily.log ftp://ftp.ncdc.noaa.gov/pub/data/ghcn/daily/*

Name                                      Size              Ticket where addressed
----------------------------------------  ----------------  ----------------------
all                                       26.044787 GB      #333
by_year                                   14.349867 GB      #334
File:COOPDaily_announcement_042011.doc    34 KB             This issue ticket
File:COOPDaily_announcement_042011.pdf    123 KB            This issue ticket
File:COOPDaily_announcement_042011.rtf    67 KB             This issue ticket
figures                                   7.834 MB          This issue ticket
File:ghcnd_all.tar.gz                     2.920341 GB       This issue ticket
File:ghcnd-countries.txt                  4 KB              This issue ticket
File:ghcnd_gsn.tar.gz                     143.311 MB        This issue ticket
File:ghcnd_hcn.tar.gz                     284.850 MB        This issue ticket
File:ghcnd-inventory.txt                  26.351 MB         This issue ticket
File:ghcnd-states.txt                     2 KB              This issue ticket
File:ghcnd-stations.txt                   8.468 MB          This issue ticket
File:ghcnd-version.txt                    1 KB              This issue ticket
grid                                      5.374317 GB       This issue ticket
gsn                                       905.501 MB        This issue ticket
hcn                                       2.479658 GB       This issue ticket
papers                                    7.613 MB          This issue ticket
File:readme.txt                           24 KB             This issue ticket
File:status.txt                           30 KB             This issue ticket
superghcnd                                5.276572125 TB    #336

Sizes were computed using the lftp du -a command.
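
As a quick sanity check, the per-entry sizes listed above can be summed and compared with the 5.329 TB quoted under Data Size. This is only a sketch: it assumes decimal units (1 TB = 1000 GB, 1 GB = 1000 MB, 1 MB = 1000 KB), which the listing does not confirm, and the KB figures for the small files are rounded.

```python
# Sizes copied from the listing above, grouped by the unit they were given in.
# Assumption (not stated in the listing): decimal units throughout.
SIZES_TB = [5.276572125]                                   # superghcnd
SIZES_GB = [26.044787, 14.349867, 2.920341, 5.374317, 2.479658]
SIZES_MB = [7.834, 143.311, 284.850, 26.351, 8.468, 905.501, 7.613]
SIZES_KB = [34, 123, 67, 4, 2, 1, 24, 30]


def total_tb():
    """Return the combined size of all listed entries in decimal terabytes."""
    total_gb = (
        sum(SIZES_TB) * 1000
        + sum(SIZES_GB)
        + sum(SIZES_MB) / 1000
        + sum(SIZES_KB) / 1_000_000
    )
    return total_gb / 1000


print(f"{total_tb():.3f} TB")
```

Under that assumption the entries add up to roughly 5.329 TB, which agrees with the stated Data Size.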

Once you are done:

  • Please post the sha256 hash results of the files.
    Recommended command: hashdeep -erl
  • Please compute the sizes using du -b --max-depth=1 --human-readable or ls -l
  • And if your copy is online, a link to the mirror.
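
If hashdeep is not available, a rough equivalent of the hash-and-size report can be sketched with the Python standard library. This is an illustration, not the project's prescribed method: the function names hash_tree and sha256_file are made up for this example, and the output is a plain dictionary rather than hashdeep's own file format.

```python
import hashlib
import os


def sha256_file(path, chunk_size=1024 * 1024):
    """Return the SHA-256 hex digest of one file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def hash_tree(root):
    """Map each file's path relative to root to (size_in_bytes, sha256_hex)."""
    results = {}
    for dirpath, dirnames, filenames in os.walk(root):
        dirnames.sort()  # visit subdirectories in a stable order
        for name in sorted(filenames):
            full = os.path.join(dirpath, name)
            rel = os.path.relpath(full, root)
            results[rel] = (os.path.getsize(full), sha256_file(full))
    return results
```

Calling hash_tree on the top-level download directory gives one entry per file; printing the entries sorted by path yields a listing that two mirrors can compare line by line.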
@fyvekatz

I'm going to download this one.

@gabefair
Collaborator Author

@fyvekatz How is it coming? It's pretty large, so I was thinking about breaking it up into smaller tickets. Would that help you?

@fyvekatz

Coming along. I ran into issues with failed file transfers and directories not wanting to list when I first attempted to download. It seems to be running smoothly at the moment: 23 GB downloaded so far. The server seems to support a download speed of about 18 MB/s. I've limited my download to 5 MB/s.
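
For a rough sense of scale, the transfer time for the full 5.329 TB can be estimated at both rates. This is a back-of-the-envelope sketch that assumes decimal units and a perfectly sustained rate, neither of which a real transfer guarantees; transfer_days is a name made up for this example.

```python
SECONDS_PER_DAY = 86_400


def transfer_days(total_bytes, rate_bytes_per_s):
    """Days needed to move total_bytes at a constant rate_bytes_per_s."""
    return total_bytes / rate_bytes_per_s / SECONDS_PER_DAY


TOTAL_BYTES = 5.329e12  # 5.329 TB from the issue description, decimal units assumed

for label, rate in (("5 MB/s", 5e6), ("18 MB/s", 18e6)):
    print(f"{label}: about {transfer_days(TOTAL_BYTES, rate):.1f} days")
```

That works out to roughly 12 days at 5 MB/s and about 3.4 days at 18 MB/s, before accounting for retries or directory-listing overhead.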

gabefair changed the title from "GHCN Daily" to "GHCN subset: /Daily" on Feb 25, 2017
@fyvekatz

Update: I have 99.99+% of the files by count, but only about 8% of the data by size. Still downloading.

@gabefair
Collaborator Author

Perfect. Once you are done, please share:

  1. The method/command you used to download
  2. The md5deep hash of the folder
  3. A URL to the mirror or copy of the files (if you have one; no problem if you don't)

gabefair changed the title from "GHCN subset: /Daily" to "GHCN Subset: /Daily" on Feb 25, 2017
@fyvekatz

Will do. I have about 2.6 TB to go.

@fyvekatz

fyvekatz commented Mar 2, 2017

I believe I have 100%. Doing a final run-through to make sure. Will post a web link and a magnet link soon.

@fyvekatz

fyvekatz commented Mar 7, 2017

http://fyvekatz.asuscomm.com:8888/ftp.ncdc.noaa.gov

Will add a torrent magnet link and the hash output in a moment.
