-
Notifications
You must be signed in to change notification settings - Fork 1.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[red-knot] Improve validation for search paths #12376
Conversation
|
I think it would ease reviewing if the validation change and the structure change aren't in the same PR. Or is there a specific reason why you put both changes into one PR? |
return Err(SearchPathValidationError::NotADirectory(typeshed)); | ||
} | ||
let stdlib = typeshed.join("stdlib"); | ||
if !system.is_directory(&stdlib) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
All the is_directory
reads will be a problem because they by-pass Salsa.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I thought search-path settings were set once at the start of the process, and then never changed after that. Given that, is Salsa invalidation necessary here? If the settings never change, then the settings will only ever be validated once, so this function will only ever be executed once.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we need to watch these paths and if something we are assuming is a directory becomes not-a-directory, we need to react to that -- but I'd also assumed that this type of change (which should be rare) would mean totally starting over, which seems like it means Salsa invalidation wouldn't be relevant.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Watching the directories wouldn't be sufficient. We would have to watch the parent directories because a change from file to directory is technically a different file (watching is file and not path-based). But that would significantly increase the number of files we watch, and some systems limit the number of files that can be watched. That's why I think that we don't want to do that.
There's also the problem that we can't watch the folders if resolving the module resolution failed. At least not without duplicating a lot of this logic in the code where we set-up file watching. That's why I'm actually not sure if we want to error in the first place or if it would be better fallback to reasonable defaults. Because failing to start means that the LSP can support no-completion whatsoever. It also can't provide any help fixing the ruff settings. It's just dead.
I thought search-path settings were set once at the start of the process, and then never changed after that.
This is correct today but not long-term. A user can change the workspace settings at any time, and we have to respond to that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I thought search-path settings were set once at the start of the process, and then never changed after that.
This is correct today but not long-term. A user can change the workspace settings at any time, and we have to respond to that.
I see -- this is different to what I had previously understood
I started off only doing the validation changes, but then I realised that really there were different invariants for module-search paths as opposed to module-resolution paths in general, and so improving the validation didn't really make sense while they were still being represented by one type. But I can certainly split it into two PRs. |
6dc2e2e
to
9275b10
Compare
I've extracted the structure change into its own PR: #12379. |
9275b10
to
05d2d42
Compare
05d2d42
to
497c4f0
Compare
Following the changes made in 2a8f95c (which this PR is now rebased on top of), this PR is now solely focussed on improving validation for search-path settings. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks good to me, but will leave to Micha to stamp once the question of Salsa invalidation is resolved.
return Err(SearchPathValidationError::NotADirectory(typeshed)); | ||
} | ||
let stdlib = typeshed.join("stdlib"); | ||
if !system.is_directory(&stdlib) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we need to watch these paths and if something we are assuming is a directory becomes not-a-directory, we need to react to that -- but I'd also assumed that this type of change (which should be rare) would mean totally starting over, which seems like it means Salsa invalidation wouldn't be relevant.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like the validation code but I don't think the "all settings must be valid or we don't use any of the provided settings" provides the best user experience. It means the CLI will fail to start and so will the LSP. This severely limits the LSP because it can't provide any help when editing the settings.
I think a better approach would be to recover as many user settings as possible. This gives us the closest to what the user wanted. We can surface the error messages with tracing::warn
or tracing::error
for now.
return Err(SearchPathValidationError::NotADirectory(typeshed)); | ||
} | ||
let stdlib = typeshed.join("stdlib"); | ||
if !system.is_directory(&stdlib) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Watching the directories wouldn't be sufficient. We would have to watch the parent directories because a change from file to directory is technically a different file (watching is file and not path-based). But that would significantly increase the number of files we watch, and some systems limit the number of files that can be watched. That's why I think that we don't want to do that.
There's also the problem that we can't watch the folders if resolving the module resolution failed. At least not without duplicating a lot of this logic in the code where we set-up file watching. That's why I'm actually not sure if we want to error in the first place or if it would be better fallback to reasonable defaults. Because failing to start means that the LSP can support no-completion whatsoever. It also can't provide any help fixing the ruff settings. It's just dead.
I thought search-path settings were set once at the start of the process, and then never changed after that.
This is correct today but not long-term. A user can change the workspace settings at any time, and we have to respond to that.
I think this is somewhere where we might want different behaviour depending on whether redknot is being run as a CLI tool or as an LSP server. As an LSP server, I can see that it makes sense to try to recover as much as possible, as the user may be in the middle of editing their workspace settings -- and even if they're not, it's annoying if redknot completely gives up from linting all your code because of a small typo in your workspace settings. As a CLI tool, I think the tradeoff is pretty different. Something I've really liked about Ruff in the past is how strictly it validates the settings passed to it. It means that there's no ambiguity, and no chance of Ruff silently doing the wrong thing. This is especially important, in my view, for a tool that applies autofixes to your code -- moreover, it's a tool that many people have installed as a pre-commit hook, which means it'll automatically apply autofixes as part of every commit. |
I agree on that assessment. But that still means that the infrastructure has to handle it gracefully for the LSP case. The CLI will want to call some function that tells it if the settings are correct. |
I don't have error handling figured out entirely at this point, but I generally see two options, and I'm leaning towards option 1 for module resolution. But it's not entirely clear how we would implement it for module resolution specifically. Handle possible fatal errors outside of salsaWhat I mean by that, is that we avoid handling errors that are potentially fatal inside of Salsa queries and instead force the CLI/LSP to handle them when creating the Salsa inputs. Specifically to module resolution, this would mean that we validate the search paths before setting them on This has two benefits:
I think I want to use a similar design for handling errors related to the configuration. The main challenge I see with this, specifically for module resolution, is that Gracefully handle the errors inside SalsaThis is the approach that I want to use for any non-fatal errors, e.g. we fail to read a single file. We extend We could do the same for module resolution but it does have the downside that forgetting to check if the operation failed is silent and, therefore, easy. |
I agree there's lots of design work still to be done here regarding error handling and settings resolution. It wasn't really my intent to solve that problem in this PR; I just really wanted to make the validation we were already doing make some kind of sense, and to add some proper error messages for it rather than simply having the module resolver call @MichaReiser, what's your specific concern with the changes this PR is making? It's already the case on I can think of lots of possible ideas for solving the bigger questions you're posing here, but I'm not sure any of them are in the scope of this PR. |
My main concern is that it isn't a clear improvement to me because it introduces new unobserved |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But I don't think these are reasons enough to not land this PR
Summary
This PR cleans up the settings validation and types in the module resolver. Namely:
ModuleResolutionPathBuf
has either no extension, a.pyi
extension or a.py
extension. But for search paths specifically (which are the only module-resolution paths that are ever directly constructed), that's the wrong kind of validation for us to do: we should just validate that the path points to a directory.<typeshed>/stdlib/
directory exists, that a<typeshed>/stdlib/VERSIONS
file exists, and that theVERSIONS
file parses.Test Plan
cargo test -p red-knot_module_resolver
. I haven't added any new tests specifically as part of this PR, but I'm happy to if you'd find them useful, either as this PR or as a followup.