New bug in v0.3.0a1:
Some Wayback redirects use a Location: header with a scheme and domain, e.g.:
Location: http://web.archive.org/web/20201027215555id_/https://www.whitehouse.gov/administration/eop/ostp/about/student/faqs
But others don’t, e.g.:
Location: /web/20201027215555id_/https://www.whitehouse.gov/ostp/about/student/faqs
The latter will cause Wayback v0.3.0a1 to fail when trying to parse the headers:
>>> import wayback
>>> c = wayback.WaybackClient()
>>> c.get_memento('http://web.archive.org/web/20201027215555id_/https://www.whitehouse.gov/administration/eop/ostp/about/student/faqs')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Users/rbrackett/Dev/datarescue/wayback/wayback/_client.py", line 724, in get_memento
    headers=Memento.parse_memento_headers(response.headers),
  File "/Users/rbrackett/Dev/datarescue/wayback/wayback/_models.py", line 285, in parse_memento_headers
    headers['Location'], _, _ = memento_url_data(raw_headers['Location'])
  File "/Users/rbrackett/Dev/datarescue/wayback/wayback/_utils.py", line 122, in memento_url_data
    raise ValueError(f'"{memento_url}" is not a memento URL')
ValueError: "/web/20201027215555id_/https://www.whitehouse.gov/ostp/about/student/faqs" is not a memento URL
Handle Location headers that are absolute paths
761178a
Most redirects in Wayback redirect to a complete URL, with headers like:
Location: http://web.archive.org/web/20201027215555id_/https://www.whitehouse.gov/administration/eop/ostp/about/student/faqs
But some include only an absolute path (which is still valid), e.g.:
Location: /web/20201027215555id_/https://www.whitehouse.gov/ostp/about/student/faqs
We weren't correctly handling the latter case, leading to exceptions while parsing headers. Fixes #59.
Handle Location headers that are absolute paths (#60)
c2ca979
Most redirects in Wayback redirect to a complete URL, with headers like:
Location: http://web.archive.org/web/20201027215555id_/https://www.whitehouse.gov/administration/eop/ostp/about/student/faqs
But some include only an absolute path (which is still valid), e.g.:
Location: /web/20201027215555id_/https://www.whitehouse.gov/ostp/about/student/faqs
We weren't correctly handling the latter case, leading to exceptions while parsing headers. Fixes #59.
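For anyone hitting this on v0.3.0a1, here is a rough sketch of one way to normalize the two header forms before parsing. It is not necessarily what the commits above do; `resolve_location` is a hypothetical helper, not part of the wayback package, and it relies only on the standard library's `urllib.parse.urljoin` to resolve the header value against the URL that was requested.

```python
from urllib.parse import urljoin


def resolve_location(request_url, location):
    """Return an absolute URL for a Location header value.

    Hypothetical helper for illustration only (not a wayback API).
    A value that is already a complete URL is returned unchanged; an
    absolute path borrows the scheme and host of the request URL.
    """
    return urljoin(request_url, location)


# The request URL and the path-only header value from the report above.
request_url = (
    'http://web.archive.org/web/20201027215555id_/'
    'https://www.whitehouse.gov/administration/eop/ostp/about/student/faqs'
)
relative = '/web/20201027215555id_/https://www.whitehouse.gov/ostp/about/student/faqs'

absolute = resolve_location(request_url, relative)
print(absolute)
```

The resolved value starts with `http://web.archive.org/web/`, so it has the same shape as the complete-URL form of the header and can be handed to whatever memento URL parsing comes next.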