Memento links should be in same mode as Memento #111

Closed
Mr0grog opened this issue Feb 26, 2023 · 0 comments · Fixed by #113
Labels
bug Something isn't working

Comments

Mr0grog commented Feb 26, 2023

In #108, I added a links property to Memento objects with parsed data from the Link HTTP headers of mementos. However, the links to other mementos in that data turn out to always be in view mode, regardless of the mode of the memento you requested!

For example:

from wayback import WaybackClient, Mode
client = WaybackClient()

memento = client.get_memento('https://epa.gov/', '20230210003633')

# Memento is in original mode:
memento.mode == Mode.original.value
# But the links are not:
memento.links == {
    'original': {
        'url': 'https://www.epa.gov/',
        'rel': 'original'
    },
    'timemap': {
        'url': 'https://web.archive.org/web/timemap/link/https://www.epa.gov/',
        'rel': 'timemap',
        'type': 'application/link-format'
    },
    'first memento': {
        # This URL is in `view` mode, not `original`!
        'url': 'https://web.archive.org/web/19970418120600/http://www.epa.gov:80/',
        'rel': 'first memento',
        'datetime': 'Fri, 18 Apr 1997 12:06:00 GMT'
    },
    # ...more links cut for brevity...
}

The suggested use for these links is to pass them directly to get_memento(), but that might get you a memento in a different mode than you expect! It’s a footgun.

Some options here:

  1. Drop the links attribute on Memento for now. Users can parse the Link header(s) themselves if they want that data, and are responsible for using it appropriately. (In this case, we also need to reopen #57, “Add info about link header relationships to Memento”.)

  2. Update the url field on any link that references a memento to match the mode of the Memento object they are attached to.

    Side note: how do we identify which things are mementos? Look for "memento" as a substring in the rel field? Look for url fields that match known memento URL patterns?

  3. Instead of the values in links being dictionaries, make them a more useful kind of data object. References to other mementos might be more like our CdxRecord objects, where the url is the captured URL (e.g. http://www.epa.gov/ instead of the memento URL), the timestamp is a datetime object, etc.

    • This one’s pretty complicated! It’s how I envisioned this feature might evolve, but isn’t obviously worthwhile in the short term.
    • I don’t know the complete universe of possible object types (it’s not just mementos, see the first two entries in the example above) and technically what goes here is pretty arbitrary. How do we future-proof things we haven’t modeled yet?
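
As a rough illustration of option (2), here is a minimal, self-contained sketch. It is not the library's implementation. It assumes that memento URLs look like `https://web.archive.org/web/<timestamp><mode>/<captured URL>`, that original mode is spelled `id_` and view mode is an empty string, and that a link references a memento when `memento` appears as a whole word in its `rel` field. The names `set_memento_mode` and `links_in_mode` are hypothetical.

```python
import re

# Assumed shape of a Wayback memento URL:
#   https://web.archive.org/web/<timestamp><optional mode>/<captured URL>
# The mode is assumed to be two lowercase letters plus "_" (e.g. "id_"),
# or nothing at all for view mode.
MEMENTO_URL = re.compile(
    r'^(https?://web\.archive\.org/web/)(\d{4,14})([a-z]{2}_)?/(.+)$'
)


def set_memento_mode(url, mode):
    """Return `url` rewritten to `mode`; non-memento URLs pass through unchanged."""
    match = MEMENTO_URL.match(url)
    if match is None:
        return url
    prefix, timestamp, _old_mode, captured_url = match.groups()
    return f'{prefix}{timestamp}{mode}/{captured_url}'


def links_in_mode(links, mode):
    """Copy `links`, rewriting only the entries whose rel names a memento."""
    result = {}
    for key, link in links.items():
        updated = dict(link)
        # Treat rel as a space-separated list, so "first memento" matches
        # but "timemap" and "original" do not.
        if 'memento' in link.get('rel', '').split():
            updated['url'] = set_memento_mode(link['url'], mode)
        result[key] = updated
    return result


links = {
    'timemap': {
        'url': 'https://web.archive.org/web/timemap/link/https://www.epa.gov/',
        'rel': 'timemap',
    },
    'first memento': {
        'url': 'https://web.archive.org/web/19970418120600/http://www.epa.gov:80/',
        'rel': 'first memento',
    },
}
fixed = links_in_mode(links, 'id_')
print(fixed['first memento']['url'])
```

This combines both ideas from the side note: the `rel` check decides which entries to touch, and the URL pattern acts as a second guard so that anything that does not look like a memento URL is left alone.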

I think (3) has too many open questions, but we should do (1) or (2) before cutting a 0.4.1 release.
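
For comparison, a very rough sketch of what a memento reference in option (3) might look like. `MementoLink` and `parse_memento_link` are hypothetical names, the URL pattern is the same assumption as in option (2), and non-memento links such as `original` and `timemap` are simply not modeled here, which is exactly the open question above.

```python
import re
from dataclasses import dataclass
from datetime import datetime, timezone

# Assumed memento URL shape: /web/<14-digit timestamp><optional mode>/<captured URL>
MEMENTO_URL = re.compile(
    r'^https?://web\.archive\.org/web/(\d{14})([a-z]{2}_)?/(.+)$'
)


@dataclass(frozen=True)
class MementoLink:
    """Hypothetical CdxRecord-like view of a link that references a memento."""
    rel: str
    url: str             # the captured URL, not the memento URL
    timestamp: datetime  # capture time as a timezone-aware datetime (UTC)
    mode: str            # '' is assumed to mean view mode


def parse_memento_link(link):
    """Return a MementoLink, or None if the link is not a memento URL."""
    match = MEMENTO_URL.match(link['url'])
    if match is None:
        return None
    raw_timestamp, mode, captured_url = match.groups()
    timestamp = datetime.strptime(raw_timestamp, '%Y%m%d%H%M%S').replace(
        tzinfo=timezone.utc
    )
    return MementoLink(
        rel=link['rel'], url=captured_url, timestamp=timestamp, mode=mode or ''
    )


first = parse_memento_link({
    'url': 'https://web.archive.org/web/19970418120600/http://www.epa.gov:80/',
    'rel': 'first memento',
    'datetime': 'Fri, 18 Apr 1997 12:06:00 GMT',
})
print(first)
```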

Mr0grog added the bug label Feb 26, 2023
Mr0grog added a commit that referenced this issue Feb 27, 2023
The `links` attribute on a Memento object links to other related resources, such as the first/previous/next/last memento of the URL. These are super useful for iterating and navigating through Mementos, BUT the archive.org servers always return these as links to the memento in "view" mode. Users of this library rarely use "view" mode, and if you are iterating through links in a different mode, simply following one of the links can lead to subtle mistakes! To mitigate the issue, this changes all the links that reference mementos to use the same mode as the Memento object they are attached to.

Fixes #111.
Mr0grog added a commit that referenced this issue Mar 7, 2023
The `links` attribute on a Memento object links to other related resources, such as the first/previous/next/last memento of the current memento's URL. These are super useful for iterating and navigating through mementos, BUT the archive.org servers always return these as links to the memento in "view" mode. Users of this library rarely use "view" mode, and if you are iterating through links in a different mode, simply following one of the links can lead to subtle mistakes! To mitigate this, we now change all the links that reference mementos to use the same mode as the `Memento` object they are attached to.

Fixes #111.
This issue was closed.