
Download all the files associated with a package from a CDN #235

Open · wants to merge 12 commits into main

Conversation

@Caleb-T-Owens (Contributor) commented Jan 24, 2024

This is not yet ready for review

The main focus

The main focus of this PR is to resolve the issue around packages that are composed of multiple files. Before this change, importmap-rails only looked for the entry-point JS file and downloaded that. When it was pinning CDN links, this worked just fine because browsers were able to resolve the relative imports against the CDN correctly.

Now that we are downloading the entrypoint and hosting it on our server's domain, when the browser tries to perform a relative import, it's unable to find the file.

The solution to this problem is to download all the supporting files for each package.
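To make the failure concrete, here's a tiny simulation (package and file names are hypothetical) of how a browser resolves a relative import against the current file's URL. The resolved path points at a file that was never vendored:

```ruby
require "uri"

# Hypothetical: we vendored only the entrypoint of a package.
vendored = ["/assets/sortable.js"]

# The entrypoint itself contains: import "./modular/sortable.core.esm.js"
# The browser resolves that specifier against the entrypoint's own URL:
resolved = URI.join("https://example.com/assets/sortable.js",
                    "./modular/sortable.core.esm.js").path

puts resolved                    # => "/assets/modular/sortable.core.esm.js"
puts vendored.include?(resolved) # => false - the supporting file was never downloaded
```

On the CDN the same relative path resolved to a real file, which is why pinning CDN URLs never hit this.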

My choice of API

To accomplish this, there were two main options:

  1. Firstly, we could make use of the new download API as suggested by DHH.
    Looking into this API, it turns out that it returns far more files than necessary. It appears to provide all the files that are not excluded by the .npmignore file. (In some of the libraries that I looked at, it included CJS, UMD and other non-ESM module systems.)

    In an ideal world, we would only download files that get used by the library.

  2. My second option was to use the staticDeps and dynamicDeps properties provided by the generate API that we are already using.
    These properties only include the files that actually get imported by the JS libraries, and as we already make this API call, it seemed ideal to make use of it.
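Option 2 can be sketched roughly as follows. The response below is an illustrative stand-in (the field names come from this discussion; the real generate API payload contains more keys):

```ruby
require "json"

# Illustrative shape of a generate API response (not the full real payload).
response = JSON.parse(<<~JSON)
  {
    "staticDeps": ["https://ga.jspm.io/npm:[email protected]/dist/module.esm.js"],
    "dynamicDeps": []
  }
JSON

# The set of files to vendor is the union of the statically and
# dynamically imported modules.
files = response.fetch("staticDeps", []) | response.fetch("dynamicDeps", [])
puts files
```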

Update Fri 26/01/2024:
After a conversation with guybedford I've switched over to using the download API. This is because the staticDeps and dynamicDeps can be too restrictive and may not provide enough files for our use case.

@guybedford has said that he will look into safely cutting down the number of files provided by the download API, so we can hopefully avoid pulling down non-ESM JS files.

The new package class

While implementing the new downloads, I found that the download logic was tightly interwoven with the packager logic. I've landed on what I think is a fairly reasonable division of responsibility: a package is responsible for resolving the folder names and doing the downloading and removals, while a packager is responsible for manipulating importmap.rb.

I'd be interested in hearing about people's thoughts on this and would be more than happy to squash it back down into the packager class if we're not ready to take on a new abstraction.
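The division of responsibility described above could be sketched like this (class and method names are illustrative, not the PR's actual API):

```ruby
# The package owns everything about its files on disk.
class Package
  def initialize(name)
    @name = name
  end

  # Resolving the vendor folder for this package's files...
  def vendor_path
    File.join("vendor", "javascript", @name)
  end

  # ...and owning downloads and removals (bodies elided in this sketch).
  def download(files); end
  def remove; end
end

# The packager only manipulates config/importmap.rb pins.
class Packager
  def pin_line(package, entry)
    %(pin "#{package}", to: "#{entry}")
  end
end

puts Package.new("alpinejs").vendor_path
puts Packager.new.pin_line("alpinejs", "alpinejs.js")
```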

The new JspmApi class

After starting to use the download API from jspm today I noticed a bunch of duplicate code between the packager class and the package class. To reduce this, I centralised the location of the network-related code into one JspmApi class.
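A minimal sketch of what centralising the network code might look like. The payload keys and the `generate` endpoint path here are assumptions based on this thread, not the PR's exact code (the download endpoint is the one documented at https://jspm.org/cdn/api#download):

```ruby
require "net/http"
require "json"
require "uri"

# One place for all JSPM HTTP calls, instead of duplicating request
# handling across the packager and package classes.
class JspmApi
  GENERATE_URI = URI("https://api.jspm.io/generate")
  DOWNLOAD_URI = URI("https://api.jspm.io/download")

  def generate(install:, env: %w[browser module])
    post(GENERATE_URI, install: install, env: env)
  end

  def download(packages:, exclude: %w[unused readme types])
    post(DOWNLOAD_URI, packages: packages, exclude: exclude)
  end

  private

  # Shared HTTP + JSON plumbing.
  def post(uri, payload)
    response = Net::HTTP.post(uri, JSON.generate(payload),
                              "Content-Type" => "application/json")
    JSON.parse(response.body)
  end
end
```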

OK - so I've got a folder of downloaded JS but my browser is failing to import the modules

The big culprit here is the fingerprinting/digesting done by sprockets and propshaft.

The issue that we run into is how relative imports are handled. When a browser encounters a relative import, it resolves the URL relative to the current file. Both sprockets and propshaft, however, append a fingerprint to the end of all JS files when they're served, but don't update the import statements inside the JS. This results in a situation where we have a file in our filesystem in the right location with the right name, but propshaft refuses to serve it.

Even though it means this is currently not much of an improvement over what we've got, I think this should not hold back this PR, as it's a limitation of two other packages (sprockets and propshaft).

I have made a PR for propshaft (rails/propshaft#181) that transforms all the JS import and export statements to use the digested URLs.
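The idea behind that transform can be illustrated with a toy version (the digest map and regex below are simplified stand-ins, not propshaft's actual code): relative specifiers in import/export statements get rewritten to their digested file names.

```ruby
# Hypothetical map from logical paths to digested paths, as an asset
# pipeline would compute at precompile time.
DIGESTS = { "./lifecycle.js" => "./lifecycle-abc123.js" }.freeze

# Rewrite relative specifiers in import/export statements to digested names,
# leaving unknown specifiers untouched.
def rewrite_imports(source)
  source.gsub(/((?:from|import|export)\s+)["'](\.\/[^"']+)["']/) do
    %(#{$1}"#{DIGESTS.fetch($2, $2)}")
  end
end

puts rewrite_imports(%(import { start } from "./lifecycle.js";))
# => import { start } from "./lifecycle-abc123.js";
```

With the import statements rewritten to match the digested filenames, the browser's relative resolution lands on a URL the asset server will actually serve.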

@guybedford (Contributor)

Feedback very much noted here that loading all files might not be preferable!

I'm going to take another stab at adding some graph metadata to the download API that would then provide a comprehensive package metadata info that could be used to do selective downloading. I think this might be preferable to staticDeps and dynamicDeps in supporting graph traces that might not have been captured during the original import map construction.

If you're interested in trying it out I hope to have something together this weekend.

@Caleb-T-Owens (Contributor, Author) commented Jan 25, 2024

Hi @guybedford! In what cases might there be files that haven't been captured in staticDeps and dynamicDeps?

I think having the download API with graph metadata capabilities would be fantastic (thanks for taking the feedback on board!). That said, it would be quite nice if the same logic for generating a comprehensive list of required dependency files could also be applied to staticDeps and dynamicDeps. That way they could reliably be used for preloading all dependencies, as their suggested use in the docs describes, and we wouldn't need to make two API calls.

@Caleb-T-Owens (Contributor, Author)

I've taken a look into sprockets to see if I could make a similar change to what I did for Propshaft (rails/propshaft#181), but looking into the codebase, I've not been able to identify where I would go about doing such a thing (if there is a place that it would fit in sprockets at all!).

Given that Rails 8 will be moving to propshaft anyway it might not be a concern, but if anyone does know of a simple way to achieve a similar change, please let me know!

@guybedford (Contributor)

@Caleb-T-Owens to explain the difference - when preloading, you only care about the exact graph that was traced: only those modules from the package that are used, and only under the environment conditions that are set. On the other hand, a full trace of a package that works for different environment conditions and for different internal modules being imported is a slightly larger list, but still smaller than the full file list of the package. The benefit of using the latter for the download API is that the downloaded package folder can be cached across uses.

@Caleb-T-Owens (Contributor, Author)

@guybedford Thanks for explaining. I'll move over to the download API tomorrow in anticipation of this option.

@Caleb-T-Owens (Contributor, Author)

Update! I've moved over to the download API.

@guybedford (Contributor)

As discussed I implemented a new exclude option for the JSPM download API in https://jspm.org/cdn/api#download. Setting exclude=unused,readme,types should reduce the file count considerably. Further feedback very much welcome.

@Caleb-T-Owens (Contributor, Author)

@guybedford Hi! Thanks for being so fast with that update. It does seem that exclude is now a required field, which is probably undesirable; e.g. GET https://api.jspm.io/download/@popperjs/[email protected] returns {"error":"\"exclude\" must be a list"}.

Something that would be nice, but not needed, is if I could POST directly to https://api.jspm.io/download and provide the versioned packages as an array in the body, rather than appending them to the end of the URL.

@guybedford (Contributor)

@Caleb-T-Owens thanks for the quick feedback too, both changes are added now.

@guybedford (Contributor) commented Jan 28, 2024

@Caleb-T-Owens if you're still finding there are too many files, I'd be interested to hear that too - we could possibly also add an env option to this exclude feature, which would filter to the needed conditional environments of the package (e.g. perhaps you don't want to download development-condition files when you're just downloading the package for production). If that would be useful, do let me know.

(The important difference from the normal generator env here is that normally we only trace a single environment, whereas this env filter can trace multiple environments at the same time.)

@Caleb-T-Owens (Contributor, Author)

@guybedford I think I could do with some more context about what the best env option is for generate.

At the minute, we're using env: ["browser", "module"], which in most cases only needs the one built file (produced by the package developer's build step). For example, with alpine.js we are provided with the entrypoint https://ga.jspm.io/npm:[email protected]/dist/module.esm.js and we don't need to look anywhere else.

When I use the download API for alpine.js, we get quite a bit of extra stuff. Seemingly, these are all the files that come out of your RollupJS and other build steps; however, when trying different envs, I couldn't get the entrypoints of those builds, and was instead given https://ga.jspm.io/npm:[email protected]/dist/module.cjs.js

Is there a better env setting that we should be using for development/production to get those particular entry points?


I guess what I'm trying to wrap my head around is this: if we're downloading all these extra files and adding them to our version control, we need a good justification for having them. I can see the ability to make a second importmap (or to alter our pin function to take a "dev" argument) that makes use of the development entrypoints being quite a good justification, but I've not been able to find an "env" that would give me those development entrypoints.

@guybedford (Contributor) commented Feb 12, 2024

Sorry for the delay here @Caleb-T-Owens - I moved apartments and had two work trips. So, for the alpine.js case: this package does not have a package.json "exports" field, so all module files are potentially loadable via subpathing. The rule of the "exports" field is that it defines the only subpaths accessible in the package - it is encapsulating. Packages without this field therefore have weaker guarantees about what code is public and what code is private.

When JSPM finds a package without an "exports" field, it falls back to its global code analysis of all consumers of that package across all code on npm. And for alpine.js, it turns out that users very much do use all of its exports - there are libraries on npm doing imports like require('alpinejs/src/lifecycle'). So that is why JSPM has added them into the "exports".
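As a concrete illustration of the encapsulation rule, a package.json like the following (a hypothetical example) makes the root the only importable subpath, so a deep import such as require('alpinejs/src/lifecycle') would fail to resolve:

```json
{
  "name": "alpinejs",
  "exports": {
    ".": "./dist/module.esm.js"
  }
}
```

Without any "exports" field, every file in the package is reachable by subpath, which is why JSPM falls back to usage analysis in that case.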

For JSPM, we could add an override to its package.json for JSPM like "exports": { ".": "./dist/module.cjs.js" }, which would then resolve this back to just being one file.

Alternatively, we could support a download variant that only downloads the main entry point for a package instead of the subpaths - but packages that do rely on other subpathing would then break when used against that mechanism (and subpaths are actually a best practice for breaking libraries up into smaller chunks of code).

Hope that explains the situation a bit better? I'm open to download API adjustments, and I'd also be more than happy to guide you through the override process if you'd like to override the alpinejs configuration here specifically.

@Caleb-T-Owens (Contributor, Author)

@guybedford Thanks for the explanation, that makes sense. I don't think we'd want to cut it down further and potentially break other packages.

miharekar added a commit to miharekar/visualizer that referenced this pull request Mar 30, 2024
Skypack seems abandoned:
- skypackjs/skypack-cdn#365
- skypackjs/skypack-cdn#362

importmap-rails switched logic to always download: rails/importmap-rails#217

But it doesn't work for complex bigger packages. There's a WIP PR to address this: rails/importmap-rails#235

So I decided to use the new importmap-rails approach where I can, and switch to jsdelivr where I have to.

I've pinned to major versions and jsdelivr should take care of everything else. I've also updated the rake task to check for major version updates.
@mhenrixon

@Caleb-T-Owens, @guybedford what do we need to be able to finish this? It keeps biting me in the butt and since work was done here, I figured it is best to use this issue.

Always happy to pair on the problem.

@guybedford (Contributor)

@mhenrixon the download API for JSPM is public and stable at https://jspm.org/cdn/api#download. And for cases like the package previously discussed, it is always possible to explicitly define the "exports" to reduce the file count or provide a JSPM-specific package override. I don't have the bandwidth to work directly on the integration side here, but do just let me know if I can help further at all.
