Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Unify the usage of path parameter in functions #977

Closed
Lesik opened this issue Mar 26, 2020 · 40 comments
Closed

Unify the usage of path parameter in functions #977

Lesik opened this issue Mar 26, 2020 · 40 comments

Comments

@Lesik
Copy link

Lesik commented Mar 26, 2020

The path parameter is used in multiple functions that Zola adds to the Tera templating engine, like get_url() and resize_image(), as well as in Markdown links like [link](path). However, the way that they resolve the path seems to differ between them.

For example, get_url() seems to do a lookup in static/ for paths that have no leading / or @, and for paths that start with an @ it does a lookup in content/. On the other hand, both get_page() and get_section() start in content/ for paths without leading / or @. It is unclear from the documentation what happens with paths with a leading / in any of these functions.

function mypath @mypath /mypath
[link](mypath) $PWD/mypath content/mypath ?
get_url() static/mypath content/mypath ?
get_page() content/mypath ? ?
get_section() content/mypath ? ?
resize_image() content/mypath ? ?

I propose the unification of the path parameter in some or all of those cases. Let's discuss!

@Lesik
Copy link
Author

Lesik commented Mar 26, 2020

By the way, I've stumbled upon this while creating a shortcode that resizes images. I wanted to link to the original image as well as the resized one, so both get_url() and resize_image() came into play. But since the path parameters are resolved differently, I couldn't just pass the same path to both functions, which I thought was pretty confusing.

@Keats
Copy link
Collaborator

Keats commented Mar 26, 2020

Related: #911

@Keats
Copy link
Collaborator

Keats commented Jun 9, 2020

Can anyone spend some time to figure a nice way to unify them? No need to actually implement it, just think things through

#1044 took the approach of actually allowing basic path and @ path.

@vojtechkral
Copy link
Contributor

vojtechkral commented Jun 18, 2020

I ran into this issue - and #788 - during refactoring of the imageproc code. I'm not yet sure how to solve this but I would like to make resize_image() consistent with the others as part of the refactor... I'll try to think of something...

@Keats
Copy link
Collaborator

Keats commented Jun 19, 2020

Yep, hopefully the next version will have fully consistent paths, I'm thinking about it too.

@Lesik
Copy link
Author

Lesik commented Jun 19, 2020

So I think the easiest one to tackle is resize_image(), as its purpose and use are pretty clear while the other functions can be used for all sorts of things. In my opinion, resize_image()'s path parameter should work exactly like markdown's ![alt](path), since both are for referencing images and that way it's more intuitive. So for paths starting with a normal letter (like team.jpg) it should do a lookup under content/$current_path$/team.jpg. However, we need some way to reference images in static/. For this we could use the / prefix, while a little confusing, it makes sense when you think about that all content in static/ will be on / after running zola.

content
└── about
    ├── imprint.png
    ├── index.md
    └── team.jpg

I think the same should apply for get_section() except that / looks under content/ instead of static/.

@vojtechkral
Copy link
Contributor

vojtechkral commented Jun 19, 2020

So for paths starting with a normal letter (like team.jpg) it should do a lookup under content/$current_path$/team.jpg. However, we need some way to reference images in static/. For this we could use the / prefix, while a little confusing, it makes sense when you think about that all content in static/ will be on / after running zola.

Ha, incidentally, I already have this implemented in my local refactor :-) However, I'm not entirely sure this won't make matters more confusing. Also, there's still the issue with #788 which I'm not sure how to solve such that the overall situation is consistent.

@vojtechkral
Copy link
Contributor

Also, and this is a bit of a nit, the @path is actually @/path if I'm not mistaken...

@Lesik
Copy link
Author

Lesik commented Jun 19, 2020

Afterwards, we should also address whether or not any or all of these functions should error out if the resource is not found.

@Keats
Copy link
Collaborator

Keats commented Jun 19, 2020

Another issue with paths: #877 but in that case from a function output

@Keats
Copy link
Collaborator

Keats commented Jun 19, 2020

And #1035 is roughly the same

@vojtechkral
Copy link
Contributor

#1035 is a dupe of #788 IMO

@Keats
Copy link
Collaborator

Keats commented Jun 27, 2020

Another type of paths that could be added once we resolve the existing ones: https://zola.discourse.group/t/experience-report-after-porting-my-blog/478/3?u=keats

Ie a shortcut for when you have a lot of nesting and unique filenames. Not for the next version though, let's make sure that

  1. it makes sense
  2. every paths are consistent first

@Keats
Copy link
Collaborator

Keats commented Jul 24, 2020

In #1086 I've decided that current_path/page.path/section.path etc will have a leading /, so if you do window.location.pathname or something it matches. Pretty much every server interpretation of a path has a leading / as well.

Another issue is that some people want to get the URL for nested assets (#1098) but I am not sure about that.

I don't think I want to force people to use leading / in functions though so I think with/without leading / should be the same as it would be confusing to be different imo.
How about $/ to refer to static folder? It's not that common so it makes sense to add a sigil imo.
get_url is tricky as people would expect path without @/ to get the correct url for markdown files if we move it to PWD...

@radio-alice
Copy link

Hey y'all no idea if this is related, but I'm building an rss reader and noticing that Zola-generated rss.xml files use relative links for images, meaning that they can't be viewed in the reader without me overriding the post's html. I assume the issue stems from this one, so just wanted to chime in to push for defaulting to absolute urls for (processed?) images in rss templates. Sorry if this is the wrong place to report this! Also sorry if this has been fixed, and the feeds I'm looking at were generated by an older version– I couldn't get zola to build an rss file at all, so couldn't check if it's still happening. Feel free to delete if totally irrelevant.

@Keats
Copy link
Collaborator

Keats commented Jul 27, 2020

At least on my site (built with 0.11, https://www.vincentprouillet.com/rss.xml) some images are absolute and some are relative. I will check why later, that's a bit odd.

@southerntofu
Copy link
Contributor

mypath and /mypath are HTML standard for relative URL, respectively to the current page and to the webroot. These are established standards and i believe we should not try to change their meaning.

From what i understand @/mypath was implemented to reference Markdown pages from the content directory. But it just so happens it produces links relative to the website's root, which is very useful as it may be distinct from the webroot.

Relative links to the website's root is also what we want when referencing stuff in the static folder. Couldn't we use this symbol for both cases, and error when an @/link points to a missing ressource (either in content/ or static/) ?

@Keats
Copy link
Collaborator

Keats commented Jul 27, 2020

Ok so in my blog, if I use markdown images it gets the full permalink but I was using a shortcode for clickable images where I was using relative links, I've just changed it and now all my image links are absolute.

@southerntofu can you expand? Do you mean only allowing @/ or still allowing mypath and /mypath?
I think in all cases, the start of the path should be the website root, and not static like get_url (which really should have been called get_url_for_static since that was the goal and is now confusing)

@southerntofu
Copy link
Contributor

@Keats I don't think "/" should reference the site's webroot, otherwise how do you reference the actual virutalhost's webroot? In my view, starting from the site's webroot is what the @/ syntax was introduced for, because reusing / for this would contradict how HTML links work.

In my view, we could have something like this in the context of https://WEBROOT/SITEROOT:

  • path is a link for relative to the current page (like in HTML)
  • /path is an absolute link for https://WEBROOT/path (like in HTML)
  • @/path is an internal link for https://WEBROOT/SITEROOT/path, which can correspond to a piece of content, a static file, or a CSS file generated from SASS (am i forgetting something?)

Two caveats:

  • pages residing in the same folder as a section cannot don't have access to the same assets, whereas relative links in HTML work on a per-folder basis (counter-intuitive for people from a web background)
  • relative and internal links should be localized when possible, so that translating an asset like an image (or anything else, really) only requires adding a corresponding asset.LANG.ext

For the first problem, i had written a simple patch which was not merged for good reasons: assets should not be copied many times in many output folders. An alternative way is by generating symlinks instead of copying the files: this would make me happy, but would not be portable to all platforms. Another way is in the path resolution process (in markdown.rs), but then it means we need to expose assets/parenthood information to this rendering process somehow? Yet another way is #840, by having pages exported to the same folder as the section, all pages in a section can link to the same assets, but this does not solve the problem for people who do not want "ugly URLs".

For the second problem, there is one way i mentioned before for templates by exposing a has_page function which enables a theme author to make macros to deal with localization of content in a flexible manner. However, this approach would not enable localization of Markdown links (except through shortcodes), and places the burden of translations support on the theme authors.

Maybe unified path resolution could address both caveats? However, a concern is what to do when there is a missing piece of content or translation? Currently, zola will not complain for "wrong" relative URLs in a page. That's good on one hand because assets can be dynamically generated outside of zola's reach, placed in the right folder, and will become available after site generation (that's how relative URLs work in HTML). On the other hand it's bad because it does not catch typos in assets names (unless you use a full internal link).

Maybe different strategies in how to deal with failed resolution apply to different situations? If we unify path resolution, we'll make a lot of things easier, but we still need control over whether a missing translation or a missing piece of content should produce an error, a warning, and/or have a fallback.

For markdown resolution, this could be inherited from a section and be overwritten on a per-page basis. Something like missing_content_error (error, warn, none) and missing_translation_error (error, warn, none). In the templates, maybe all functions dealing with path could have similar parameters, defaulting to parameters from site config?

Additionally, if templates can silently detect failed resolution, it can introduce new usecases like:

  • displaying "missing translation" messages with links to where a user can contribute translations for their language for specific pieces of content (maybe a new banner?)
  • displaying "missing data" messages where we use load_data, with instructions how to generate the data (could be useful for a test result page like here)

What do you think?

@Keats
Copy link
Collaborator

Keats commented Jan 14, 2021

I think it makes a lot of sense, but I need to think more about the i18n cases and it can be a bit separated from this issue.

I think the main issue here is get_url but I don't know if the path to an asset in static should start with @/ because it refers to link to content only in markdown (where we still use relative links for static assets). I haven't given much thoughts to this issue recently, I'll need to spend some time thinking about it.

@Keats
Copy link
Collaborator

Keats commented May 10, 2021

Ok time to actually work on that so it finally makes sense and to handle i18n better. How does that look to everyone?

function mypath @/mypath /mypath
[link](mypath) $PWD/mypath content/mypath $BASE_URL/mypath
get_url() [$PWD/mypath, content/mypath] content/mypath $BASE_URL/mypath
get_file_hash() [$PWD/mypath, content/mypath] content/mypath N/A
resize_image() [$PWD/mypath, content/mypath] content/mypath N/A
get_image_metadata() [$PWD/mypath, content/mypath] content/mypath N/A
load_data() [$PWD/mypath, content/mypath] content/mypath N/A
get_page() content/mypath content/mypath N/A
get_section() content/mypath content/mypath N/A

The idea is that it should be easy to:

Next steps are to write some tests for every issues mentioned and see if the above would solve them.
get_page/get_section can only be used to refer to something in content/ so whether or not we are putting the @ we would still look in content. This might make some things easier when building paths in templates.
It is a breaking change for get_url as people will need to do static/my.js instead of my.js and it should also work on colocated assets somehow.

[$PWD/mypath, content/mypath] means we are actually checking from both places in that order.

I'm also a bit unsure on where to allow /mypath.

Any thoughts?

@Keats Keats pinned this issue May 11, 2021
@Keats
Copy link
Collaborator

Keats commented May 12, 2021

I've started work on #1455

@Keats
Copy link
Collaborator

Keats commented May 17, 2021

The state of #1455 which I will merge soon:

  • more tests (🎊 )
  • some functions have been marked as safe (eg, no need anymore to do get_url() | safe, just get_url() will output the same string)
  • resize_image now returns both the new image URL as well as its path in the static directory, allowing it to be chained with get_image_metadata

Now for the paths themselves, all the functions working on files (get_url, get_file_hash, resize_image, get_image_metadata, load_data) will not use the following logic in this order:

/// 1. base_path + path
/// 2. base_path + static + path
/// 3. base_path + content + path
/// A path starting with @/ will replace it with `content/`

In the PR state, this means that pretty much anything starting with / will not work.

A few questions remain:

  1. page.path and section.path have a leading / so just concatening things to it would fail. Until now, get_url was stripping leading / from the given path, should I add a rule doing that in all functions?
  2. get_url will still end up doing a best guess for some things: for example calling it in the index with a path to a colocated asset will give you back an invalid link since it doesn't look around to check if it's a colocated asset. I could add that
    but is it worth it?
  3. Calling get_image_metadata on a resized image will fail the first time, since images are processed after the templates. I've added a allow_missing argument that a user can use to ignore file not found as a stop gap solution - I'm not sure what is the best thing to do there. cc @vojtechkral

What do people think?

@vojtechkral
Copy link
Contributor

vojtechkral commented May 17, 2021

  1. Calling get_image_metadata on a resized image will fail the first time, since images are processed after the templates. I've added a allow_missing argument that a user can use to ignore file not found as a stop gap solution - I'm not sure what is the best thing to do there. cc @vojtechkral

Good question, don't know exactly what would be best right now either. I've pulled your branch and I'm having a look. Will try to figure out something...

@vojtechkral
Copy link
Contributor

vojtechkral commented May 17, 2021

@Keats How about we make both resize_image() and get_image_metadata() return the same kind of object: {url, static_path, width, height}. resize_image() Can load the metadata right away just like get_image_metadata() and compute the dimensions, this shouldn't slow it down too much. The actual resize can still be done later in a deduped and parallel fashion. Does that sound ok?

Edit: So many typos, sorry

Edit2: I am willing to implement this, I've been procrastinating on the imageproc cleanup long enough and I've just finished some other stuff, so I should be able to get around to it finally...

@Keats
Copy link
Collaborator

Keats commented May 17, 2021

That's a good idea! If you can do the PR on #1455 it would be great

Edit: wait, how would do know the sizes with something like a fit where you only pass the width?

@vojtechkral
Copy link
Contributor

vojtechkral commented May 17, 2021

Edit: wait, how would do know the sizes with something like a fit where you only pass the width?

I think that should be computable from the input width and image metadata...
Edit: Actually, having the same type for resize_image() and get_image_metadata() doesn't help, it's quite different, but the idea of computing the size beforehand is ok I think.

@vojtechkral
Copy link
Contributor

vojtechkral commented May 20, 2021

@Keats FWIW, I'm working on it, about half way through, will try to report back asap with something...

Edit: WIP commits in my paths-vojtech branch. Will yet need to test a bit more, clean it up a bit, add docs etc.

@Keats
Copy link
Collaborator

Keats commented May 21, 2021

Cheers, take your time we're not in a hurry. What do you think of the other functions change? Does it make more sense now?

@vojtechkral
Copy link
Contributor

vojtechkral commented May 23, 2021

@Keats I've created a PR at #1484. The changes aren't documented yet, I suppose I would update the doc after we figure what the end result is going to be.

I think overall the change makes sense as far as I understand. Does the $PWD in the table refer to the directory of the current MD or HTML file? This isn't yet implemented, right? Or maybe I misunderstand...

@Keats
Copy link
Collaborator

Keats commented May 24, 2021

Look at the comment in #977 (comment) rather than the table it should be clearer. I'll have a look at the PR

@adworacz
Copy link

Does this new implementation address the use case presented in #1161 around get_image_metadata and colocated assets?

Specifically, being able use a shortcode and use it in Markdown like so:

Markdown - content/example/index.md:

{{ image(src="image-example.png") }}

Template - templates/shorcodes/image.html:

{% set meta = get_image_metadata(path=src) %}
<img src="{{src | safe}}" width="{{meta.width}}" height="{{meta.height}}" />

Or do we still need to use the page.path variable and strip the leading slash?

@Keats
Copy link
Collaborator

Keats commented May 30, 2021

do we still need to use the page.path variable and strip the leading slash?

You will still need to use the page.path variable but I will make it so you don't have to strip the leading slash.

@Keats
Copy link
Collaborator

Keats commented Jun 10, 2021

It's all available in the next branch now, thanks to @vojtechkral for the improvements on the image handling. All functions operating on files now use the same helper function to get the path so it can't be more unified than that!

@Keats
Copy link
Collaborator

Keats commented Jun 11, 2021

@vojtechkral any reason we are not caching get_image_metadata response? I can add that if it's ok

@vojtechkral
Copy link
Contributor

@Keats Missed the notification, sry. Anyway, no reason, agreed it would be a good idea to cache it. Not sure if I can get to it sooner than next weekend :/

@Keats
Copy link
Collaborator

Keats commented Jun 14, 2021

Ah I didn't mean you need to be the one to do it, I can do it myself! Just wanted to double-check it makes sense

@tusamni
Copy link

tusamni commented Jul 7, 2021

Any idea when the next branch will be merged?

I'm trying to access images in the static folder using get_image_metadata and struggling to figure out the correct path.

@Keats
Copy link
Collaborator

Keats commented Jul 7, 2021

Maybe next week? Just trying to find bugs and fix them before release. You can try the next branch and report if everything works for you there!

@Keats
Copy link
Collaborator

Keats commented Jul 19, 2021

Should be fixed in 0.14 which is now released! Thanks all for the feedback and please open issues if you encouter bugs or weird things.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

7 participants