ValueError: time data #88
Comments
I see in `_utils.parse_timestamp` there is already some logic to guard against a day of `00`. In the test it looks like this timestamp Did someone from the Internet Archive indicate that this was a good way to handle the situation? Absent any other information I would have been inclined to rewrite the
This commit extends the existing logic for handling invalid days of `00` to months that are `00`. It also adds a warning to be logged in both situations. For example, a timestamp of `20200001120000` will get rewritten to `20200112000000` prior to conversion to a datetime. I have tested on live CDX API data that was failing, and this fix causes the full result to be returned. If more is known about why this approach is taken, it would be good to add that in a comment. Closes edgi-govdata-archiving#88
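To make the rewrite described in the commit message concrete, here is a minimal sketch of the idea: drop a spurious `00` month or day, pad the end with `00`, and log a warning. The names `repair_timestamp` and `parse_repaired` are hypothetical and this is not the actual code in `_utils.parse_timestamp`; it also assumes a 14-digit `YYYYMMDDhhmmss` timestamp.

```python
import logging
from datetime import datetime

logger = logging.getLogger(__name__)


def repair_timestamp(raw):
    """Hypothetical helper: remove a "00" month or day and pad the end.

    Assumes `raw` is a 14-digit YYYYMMDDhhmmss string.
    """
    chars = list(raw)
    # Month occupies positions 4-5. A month of "00" is treated as two
    # extra characters: delete them and append "00" to keep 14 digits.
    if chars[4:6] == ['0', '0']:
        logger.warning('Timestamp %r has a month of "00"; rewriting it', raw)
        del chars[4:6]
        chars.extend('00')
    # Day occupies positions 6-7 and gets the same treatment.
    if chars[6:8] == ['0', '0']:
        logger.warning('Timestamp %r has a day of "00"; rewriting it', raw)
        del chars[6:8]
        chars.extend('00')
    return ''.join(chars)


def parse_repaired(raw):
    """Repair the timestamp, then parse it into a naive datetime."""
    return datetime.strptime(repair_timestamp(raw), '%Y%m%d%H%M%S')
```

With this sketch, the example from the commit message, `20200001120000`, becomes `20200112000000`, and the timestamp from the original report, `20000008241731`, becomes `20000824173100`.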
We talked on Slack, but summarizing here for transparency…
Yep! Rolling the 24 over to a day of 01 and an hour of 00 was what we initially thought was right, but someone from the Archive found the actual archived content and it turned out that was not correct — somehow an extra IIRC, nobody was sure whether it was good to generalize or not, but since a
In the wild! More details in #85.
This commit extends the existing logic for handling invalid days of `00` to months that are `00`. It also adds a warning to be logged in both situations. For example, a timestamp of `20200001120000` will get rewritten to `20200112000000` prior to conversion to a datetime. Fixes #88 Co-authored-by: Rob Brackett <[email protected]>
I happened to be doing this:
and noticed that after running for 10 minutes or so it blew up with:
It looks like the CDX API returned a datetime `20000008241731` which throws an exception during parse because `00` isn't a valid month?

I don't know what the solution is here: