ImportCache of CanonicalizeResult is broken for relative imports handled by any importer that is not a baseImporter #2208
After reviewing the code, I think the issue is in the API design that we are expecting a single dart-sass/lib/src/import_cache.dart Lines 156 to 174 in 783c248
In these lines, the I can see two issues here:
In my opinion, I think the idea of |
Relative paths are resolved using the importer that loaded the current module (this is what the baseImporter is, which is why there is only one). When going through other importers, URLs are not meant to be treated as relative. To me, the test you added in sass-spec does not respect the rules that were defined in the spec for importers (and I think it is impossible to define the same importer in the Dart API). @nex3 reading the changelog in https://github.com/sass/dart-sass/releases/tag/1.68.0, it looks like filesystem importers have a special behavior in the JS API which has no equivalent in the Dart API. Is this expected? It looks like this is the root cause of this caching issue, because it allows JS filesystem importers to resolve the same URL to different canonical URLs.
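As a minimal sketch of the resolution rule described above, the snippet below resolves a relative load against the canonical URL of the stylesheet that contains it, using the WHATWG `URL` class built into Node.js. The file URLs are made up for illustration and are not taken from the thread.

```javascript
// Hypothetical canonical URL of the stylesheet containing the @use/@import.
const containingUrl = new URL('file:///project/src/main.scss');

// A relative load such as `@use "partials/buttons"` is resolved against the
// containing stylesheet's canonical URL. The result is absolute, but it is
// not necessarily canonical yet (no extension, no partial underscore).
const resolved = new URL('partials/buttons', containingUrl);

console.log(resolved.href); // file:///project/src/partials/buttons
```

The resolved URL is what the importer that loaded the current module would then be asked to canonicalize.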
This is exactly where the bug is. We are passing relative urls to The fundamental issue I see is that resolving a relative url and canonicalizing a url are not necessarily the same thing: a url resolved against a base url is only meant to be an absolute url, and not necessarily a canonical url. This makes it utterly confusing to "canonicalize" a relative url with three different patterns in the current API:
In my experience, these mixed patterns make it extremely difficult to write a generic custom importer, or importers that correctly handle all the cases (and it's not well documented). As a user of the API, I ended up finding that not relying on the base importer, which may or may not resolve the url, and using importers alone to handle both url resolution and canonicalization, is much more flexible, easier, and more consistent.
From the threads I've read, it appears that the initial design is exactly as you described: the base importer is meant to take a resolved url, which the user can further canonicalize, and the other importers are meant to handle any unresolved url, which is "assumed" to be absolute, though this is not enforced. Unfortunately, the community missed the initial RFC, and when the API had already been out for a long time, the webpack team came and said they needed the ability to customize the "resolving" of the url, rather than just being given an already resolved url to canonicalize. This was accepted, and that's why we have containingUrl. This change means the previous restrictions in the spec, if any, should have been lifted, and that what I'm doing (customizing the resolution of relative urls) is completely legal as long as my implementation is deterministic. One of the major design objectives of the new module system and the new API is deterministic resolution and canonicalization. The cache in question is based on the assumption that importers are deterministic and always return the same canonicalized absolute url for the same input. For any single importer it should be deterministic given the same
Sorry to chime in a bit late here, I was out sick for a chunk of last week. I do think there are bugs here, but I don't think the fix in #2209 is exactly correct. Let me give some background that may help explain the way things work today.

History Lesson

The Old Days

When we first added support for importers to Sass, we didn't have a rigorous understanding or specification of exactly what the strings after an Given that, all In this system, we still wanted to enforce the idea that relative loads always take precedence over loads from the load path. But because we weren't officially working with URLs, the only way we were able to do so was by fiat. We just had a separate

Dawn of Dart Sass

When I was implementing Dart Sass, especially knowing that But we still had to deal with the fact that all the existing importers (and users!) thought in load-path terms, where they just used relative URLs to load from their importers. We couldn't just say "all URLs going into importers are absolute from the start". And we also needed to ensure that relative loads would take precedence over these load-path-style loads.

I think the solution was rather clever: because all loads were now represented as URLs anyway, we'd represent relative paths as URLs that were absolute but not canonical, based on taking the relative URL that appeared in the file and resolving it based on that file's canonical URL. Then we could safely pass a truly relative URL to

Contextual Complications

Things get more complex from here. In order to support Node-style package resolution, we had to add a means for even absolute loads to see what file they're being loaded from. While it's true that the WebKit loader folks were the first to ask for this, it's also strictly necessary for the Node package importer, so we'd have wanted to add it sooner or later anyway.
Node.js's package resolution algorithm is unavoidably sensitive to where the load occurs, because it needs to use the most local To accommodate this without going fully back to the old Ruby Sass days of fully relying on importer implementors to correctly follow all the rules, we added a way for importers to "opt in" to getting the contextual information: they would indicate which absolute URL schemes they would only accept as non-canonical absolute URLs and never return from

This is where the second bug snuck in, though. Because in the old system canonicalization was expected to be purely dependent on its input arguments, we had started caching it to avoid extra filesystem hits for commonly-loaded stylesheets. But now, in some circumstances, canonicalization could depend on the containing URL, and we didn't include that in the cache. This is the specific bug that's highlighted by sass/sass-spec#1969.

How Should We Fix It?

I think the critical element of the fix here is pretty simple: don't cache The more complex answer would be to stop caching for a given importer once that importer accesses In the short term, since this does produce visible behavior bugs, I'm going to fix those using the broader disable. Then we can look into adding caching back with appropriate cache-busting later on.
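To make the caching bug described above concrete, here is a rough sketch in plain JavaScript. It does not use the real Sass API: the `canonicalize` function, the `pkg:` scheme, and the file URLs are all hypothetical, chosen only to show what happens when a context-dependent result is cached under the requested URL alone.

```javascript
// Hypothetical importer whose result depends on where the load happens,
// loosely modelled on package-style resolution. Not the real Sass API.
function canonicalize(url, containingUrl) {
  if (!url.startsWith('pkg:')) return null;
  // Resolve against the directory of the containing file.
  return new URL(`node_modules/${url.slice(4)}.scss`, containingUrl).href;
}

// Cache keyed only by the requested URL: this is the shape of the bug.
const naiveCache = new Map();
function cachedCanonicalize(url, containingUrl) {
  if (!naiveCache.has(url)) {
    naiveCache.set(url, canonicalize(url, containingUrl));
  }
  return naiveCache.get(url);
}

const fromA = cachedCanonicalize('pkg:lib', 'file:///app/a/main.scss');
const fromB = cachedCanonicalize('pkg:lib', 'file:///app/b/main.scss');
const expectedB = canonicalize('pkg:lib', 'file:///app/b/main.scss');

console.log(fromA);               // file:///app/a/node_modules/lib.scss
console.log(fromB === fromA);     // true: the stale entry is reused
console.log(fromB === expectedB); // false: the load from b/ got a/'s result
```

The second load is answered from the cache even though the importer, asked directly, would have returned a different canonical URL.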
When I use the compileString API with no "url" defined for the string input, a "relative" import from the root stylesheet goes into pattern 2, with a custom base importer. For example:

const sass = require('sass');

sass.compileString('@import "a";', {
  importer: {
    // Log every URL this importer is asked to canonicalize.
    canonicalize(url, _context) {
      console.log(`// canonical(${url})`);
      return new URL('u:a');
    },
    // Always return the same stylesheet, whatever the canonical URL is.
    load(_url) {
      return {
        contents: 'a {b: c}',
        syntax: 'scss',
      };
    },
  },
});
My initial solution was effectively to have a per-importer canonicalization cache whenever baseUrl (containingUrl) is passed, regardless of whether it is used, but only for relative urls that have no scheme. Performance-wise it is going to be slower, but still better than no cache. I understand this is not necessarily correct. For example, if absolute urls are somehow canonicalized differently depending on the baseUrl, this can very well still be broken (IMO if this really happens it's the importer's fault, which is why I did not consider this case). However, if we expand this idea to both relative and absolute urls to always use a cache key of @nex3 What do you think? I updated the PR to implement this.
This is an invalid invocation per the TypeScript types: That said, it looks like the Dart API and the embedded protocol both do allow this pattern. Per spec, this would end up invoking the WHATWG URL parser with a relative URL string and an undefined base URL, which would return a failure. We don't do that currently—we just treat this as a load-path-style canonicalization for the base importer, which doesn't really make sense. We should issue a deprecation for this behavior and clarify that the
This is intentional and desirable behavior: for example, in the Node.js package importer, absolute
An important goal of caching canonicalization is to avoid the repeated process of querying each importer in the chain for URLs we already know how to load. You're right that we could avoid the issue by just building all the possible information into the cache key, but then we'll only get cache hits if we're reloading the same URL from the same source file, something that's only going to happen in I have an in-progress PR which should address these issues.
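One way to keep cache hits for context-free loads while staying correct for contextual ones is to track whether an importer actually read the containing URL, which is the idea named in the linked PR title "Implement access tracking for containingUrl". The sketch below is an assumption about how that could look in plain JavaScript; `canonicalizeTracked` and both sample importers are hypothetical and are not the dart-sass implementation.

```javascript
// One cache per importer, keyed by the requested URL.
const caches = new WeakMap();

// Hypothetical helper: runs an importer's canonicalize() and caches the
// result only if the importer never read `context.containingUrl`.
function canonicalizeTracked(importer, url, containingUrl) {
  let cache = caches.get(importer);
  if (!cache) {
    cache = new Map();
    caches.set(importer, cache);
  }
  if (cache.has(url)) return cache.get(url);

  let accessed = false;
  const context = {
    // The getter records that the result may depend on the containing URL.
    get containingUrl() {
      accessed = true;
      return containingUrl;
    },
  };

  const result = importer.canonicalize(url, context);
  if (!accessed) cache.set(url, result);
  return result;
}

// Context-free importer: safe to cache by URL alone.
let contextFreeCalls = 0;
const contextFree = {
  canonicalize(url) {
    contextFreeCalls++;
    return `u:${url}`;
  },
};

// Contextual importer: reads containingUrl, so its results are not cached.
const contextual = {
  canonicalize: (url, context) => new URL(url, context.containingUrl).href,
};

canonicalizeTracked(contextFree, 'a', 'file:///x/main.scss');
canonicalizeTracked(contextFree, 'a', 'file:///y/main.scss');
console.log(contextFreeCalls); // 1: the second call was a cache hit

console.log(canonicalizeTracked(contextual, 'a', 'file:///x/main.scss')); // file:///x/a
console.log(canonicalizeTracked(contextual, 'a', 'file:///y/main.scss')); // file:///y/a
```

In this sketch an importer that ignores the context keeps the full benefit of caching, while one that reads it is re-queried on every load.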
Having spent the day working through all the consequences of deprecating Pattern 2, I'm not quite sure that it's correct to remove it. We do currently use it to represent situations like parsing stdin on the CLI and allowing the entrypoint to load files from the working directory without making that available as a load path everywhere. Using a load-path-style importer for this is a bit of a hack, but any alternative—like creating a fake URL for the file—is also a bit of a hack, and has knock-on consequences like not being able to use All that is separate from the caching issue, which is still definitely a bug and essentially independent of the question of how the base importer is handled (since the base importer never has access to the containing URL anyway). |
I agree that adding an While this does require additional infrastructure, it should be pretty straightforward to implement. As we already have @nex3 Let me know if you want to take it, or I'm happy to contribute some time on it. I took a brief look and it is indeed a bit complicated, because in the native dart code containingUrl is implemented as
Yeah, after sleeping on this, I think leaving the existing behavior as-is for Pattern 2 probably makes the most sense. The most accurate representation of the underlying logic here would be to have a mandatory base URL and a flag indicating whether that's also the canonical URL, but I think that's more complicated than it's worth for what is ultimately an edge case—as is going through a full deprecation process to move to a solution that's also not ideal. @ntkme I agree that we don't need to cache-bust the entire importer on a |
See sass/dart-sass#2208 Co-authored-by: Carlos (Goodwine) <[email protected]>
One note we might want to add to the change log is that the
In the real world, the first case is likely pretty rare. However, the second case can be somewhat common as a performance optimization to 1) avoid computing the same result twice in two different importers; and 2) avoid reading large raw files on the embedded host and then sending them via protobuf to the compiler. In fact, this optimization can also be applied to the legacy importer in embedded-host-node. We will never know whether a user's importer is stateful or stateless. Most of them are likely stateless, but some stateful importers might be impacted by this change, so I think it's worth documenting how caching for canonicalization works and how it could impact a stateful importer.
Importers are really expected not to be stateful, so I'm not concerned about subtle behavioral changes for them. It's fine to have a few exceptions for backwards-compatibility purposes with old bad importer APIs, but non-tooling authors should never do that.
Closing this as we decided to abandon the deprecation and the cache optimizations have been delivered in
Deprecate importer without base URL (*ImportCache.canonicalize: Deprecate base importer without URL #2213)containingUrl
(Explicitly allow a base importer without a base URL sass#3831)containingUrl
and cache the rest (Implement access tracking for containingUrl #2220) (Add a per-importer cache for loads that aren't cacheable en masse #2219)
Drop support for importer without base URL
See https://github.com/sass/sass-spec/pull/1969/files for reproduction of the issue.