xml:base in Atom feeds does not appear to be respected #566

Open
numist opened this issue Apr 8, 2022 · 5 comments

numist commented Apr 8, 2022

I'm using jekyll-postfiles to keep content (like images) local to the post, and the local references work with a feed reader thanks to an xml:base attribute on the content tag emitted by jekyll-feed.

Unfortunately those links are failing per netlify-plugin-checklinks:

9:37:55 PM:   ✖ FAIL load _site/f001.jpg
9:37:55 PM:   | operator: load
9:37:55 PM:   | expected: 200 _site/f001.jpg
9:37:55 PM:   |   actual: ENOENT: no such file or directory, open '/opt/build/repo/_site/f001.jpg'
9:37:55 PM:   |       at: _site/feed.xml:7:14 (inlined Html) <img src="f001.jpg" alt="A photo of the circuit board with component F001 circled (near the unpopulated twin inductors)">

Is this more of a hyperlink problem, or is it HTML-only?
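The path in the log is consistent with the relative src being resolved against the feed file itself rather than against xml:base. A minimal sketch of the two resolutions, using the standard URL class. The feed path comes from the log above; the xml:base value is a made-up placeholder, since the post URL is not given in this comment:

```javascript
// Location of the feed in the build output, as reported in the log above.
const feedUrl = "file:///opt/build/repo/_site/feed.xml";

// Without xml:base, the relative src resolves against the feed file itself.
const withoutBase = new URL("f001.jpg", feedUrl);

// Hypothetical xml:base value (placeholder, not from the original report).
const xmlBase = "https://example.com/post/f001/";

// xml:base may itself be relative, so resolve it against the feed URL first.
const withBase = new URL("f001.jpg", new URL(xmlBase, feedUrl));

console.log(withoutBase.pathname); // /opt/build/repo/_site/f001.jpg
console.log(withBase.href);        // https://example.com/post/f001/f001.jpg
```

The first result matches the ENOENT path in the failure, which suggests the checker is ignoring xml:base.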

numist (Author) commented Apr 8, 2022

For anyone running into this in the future, add to your netlify.toml:

[[plugins]]
  package = "netlify-plugin-checklinks"

  [plugins.inputs]
  skipPatterns = [
    "_site/feed.xml",
  ]

main.css is also giving me grief for some yet-undetermined reason, so you might want to add that too if you're using checkExternal = true like I am.

Munter (Owner) commented Apr 26, 2022

@numist

Sounds like link resolution is missing support for xml:base. The fix belongs in Assetgraph, in either the RSS or the Atom asset type.

Could you create a reduced test case with one HTML file that links to a feed, which in turn links to an image in this manner? Then we can add it to the Assetgraph test cases and base a patch upon it.

The CSS issue you mention lacks a bit too much information for me to act upon. But you are welcome to open another issue with it so we can have a look if you found a bug, or if our error messages are maybe just too cryptic :P

numist (Author) commented Apr 28, 2022

I just added an exception for all CSS links for now; I'll open a new issue with a reduced example when I restyle my site.

The below will need some massaging to fit your test suite's architecture, but here's some reduced code from my own website. Both files validate: the HTML with https://validator.w3.org/nu/ and the XML with https://validator.w3.org/feed/.

feed.xml:

The important things to modify for testing here are xml.feed.id (self-referential URI), xml.feed.entry.content[xml:base] (baseurl, obv), and xml.feed.entry.content.p.img[src] (relative path to test image from baseurl). Probably xml.feed.entry.id should point to something valid? I would have left it out, but it's required by Atom.

<?xml version="1.0" encoding="utf-8"?>
<feed xmlns="http://www.w3.org/2005/Atom" >
  <updated>2022-04-27T09:26:27+00:00</updated>
  <id>https://numi.st/feed.xml</id>
  <title type="html">my cool website title</title>
  <entry>
    <title type="html">my cool entry title</title>
    <updated>2022-04-21T00:00:00+00:00</updated>
    <id>https://numi.st/post/2022/travel-uke</id>
    <content type="html" xml:base="https://numi.st/post/2022/travel-uke/">
      <![CDATA[<p><img src="IMG_1232.jpeg" /></p>]]>
    </content>
    <author><name>Not Blank</name></author>
  </entry>
</feed>

index.html:

Obv html.head.link[href] needs to point at the xml file above

<!DOCTYPE html>
<html lang="en">
  <head>
    <title>my site title</title>
    <link type="application/atom+xml" rel="alternate" href="https://numi.st/feed.xml" />
  </head>
  <body></body>
</html>
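For a test built on these two files, the URL a fixed resolver should arrive at can be sketched as below. This is only an illustration: the function name expectedImageUrl is made up, and the regex extraction is a shortcut that a real implementation would replace with an XML parser.

```javascript
// Reduced copy of the <content> element from feed.xml above.
const contentElement = `<content type="html" xml:base="https://numi.st/post/2022/travel-uke/">
  <![CDATA[<p><img src="IMG_1232.jpeg" /></p>]]>
</content>`;
const feedUrl = "https://numi.st/feed.xml";

// Hypothetical helper: find xml:base and the first img src, then resolve.
function expectedImageUrl(xml, documentUrl) {
  const baseMatch = /xml:base="([^"]*)"/.exec(xml);
  const srcMatch = /<img\s[^>]*src="([^"]*)"/.exec(xml);
  if (!srcMatch) return null;
  // Fall back to the feed URL when no xml:base attribute is present.
  const base = baseMatch
    ? new URL(baseMatch[1], documentUrl)
    : new URL(documentUrl);
  return new URL(srcMatch[1], base).href;
}

console.log(expectedImageUrl(contentElement, feedUrl));
// https://numi.st/post/2022/travel-uke/IMG_1232.jpeg
```

Note that the trailing slash in xml:base matters: with "https://numi.st/post/2022/travel-uke" (no slash) the same src would resolve to https://numi.st/post/2022/IMG_1232.jpeg.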

Munter (Owner) commented Apr 29, 2022

@papandreou I'm pretty sure we have the correct modeling of an inline HTML fragment in an Atom <content> block. I didn't know about the ability to set xml:base, though. Do you think we could map the xml:base attribute to what I guess would have to be a new baseUrl setter in Html, or possibly all the way up in Asset?

https://github.com/assetgraph/assetgraph/blob/master/lib/assets/Html.js#L40-L51

@papandreou

Html is already wired up to get the baseUrl from the superclass' baseUrl getter (and then possibly modify it if there's a <base href=...> in the HTML itself): https://github.com/assetgraph/assetgraph/blob/815ae4b44b30d004bd5fc247cd39c10523cd448e/lib/assets/Html.js#L41-L50

The superclass (Asset) will delegate to its first "non inline ancestor" when it's an inline asset: https://github.com/assetgraph/assetgraph/blob/815ae4b44b30d004bd5fc247cd39c10523cd448e/lib/assets/Asset.js#L556-L559

The challenge seems to be that each inline HTML snippet can have its own xml:base attribute. I guess the easiest thing is to pick it up when resolving the relation here, add it as a baseUrl property on the relation's "to" object, and then make sure that the baseUrl getter in Html supports that case as well.
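A rough sketch of that idea, to make the delegation concrete. SketchAsset and its properties are hypothetical stand-ins written for this illustration; they are not Assetgraph's actual classes or API, and the <base href=...> handling in Html is left out:

```javascript
// Hypothetical stand-in for an asset; NOT Assetgraph's real Asset class.
class SketchAsset {
  constructor({ url, parent, baseUrl } = {}) {
    this.url = url;          // set for non-inline assets (e.g. the feed)
    this.parent = parent;    // set for inline assets (e.g. an HTML snippet)
    this._baseUrl = baseUrl; // assumed to be copied from xml:base on resolve
  }

  get isInline() {
    return !this.url;
  }

  // Walk up until an asset with its own URL is found.
  get nonInlineAncestor() {
    let asset = this;
    while (asset && asset.isInline) asset = asset.parent;
    return asset;
  }

  get baseUrl() {
    const ancestor = this.nonInlineAncestor;
    const inherited = ancestor ? ancestor.url : undefined;
    // An explicit base (from xml:base) wins; it may be relative itself.
    return this._baseUrl
      ? new URL(this._baseUrl, inherited).href
      : inherited;
  }
}

const feed = new SketchAsset({ url: "https://numi.st/feed.xml" });
const plainSnippet = new SketchAsset({ parent: feed });
const basedSnippet = new SketchAsset({
  parent: feed,
  baseUrl: "https://numi.st/post/2022/travel-uke/",
});

console.log(plainSnippet.baseUrl); // https://numi.st/feed.xml
console.log(new URL("IMG_1232.jpeg", basedSnippet.baseUrl).href);
// https://numi.st/post/2022/travel-uke/IMG_1232.jpeg
```

Keeping the override per inline asset, rather than on the feed, is what allows each <content> element to carry a different xml:base.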
