-
Notifications
You must be signed in to change notification settings - Fork 12.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add case insensitive comparison, besides Levenstein for DYM #46347
Add case insensitive comparison, besides Levenstein for DYM #46347
Conversation
Thanks for the pull request, and welcome! The Rust team is excited to review your changes, and you should hear from @pnkfelix (or someone else) soon. If any changes to this PR are deemed necessary, please add them as extra commits. This ensures that the reviewer can see what has changed since they last reviewed the code. Due to the way GitHub handles out-of-date commits, this should also make it reasonably obvious what issues have or haven't been addressed. Large or tricky changes may require several passes of review and changes. Please see the contribution instructions for more information. |
src/test/ui/issue46332.rs
Outdated
@@ -0,0 +1,14 @@ | |||
// Original problem is Levenstein distance. | |||
|
|||
fn TyUint() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(An enum variant rather than a function is closer to the spirit of the example in the issue (which isn't to say that we shouldn't test functions, too, but those are snake_case_in_standard_style
, unlike TyUint
).)
src/test/ui/issue46332.rs
Outdated
@@ -0,0 +1,14 @@ | |||
// Original problem is Levenstein distance. | |||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
issue-46332.rs
(with the hyphen) as a file name would conform more with most of the other test file names. The license also needs to be at the top of the file (the build system will check).
src/librustc_resolve/lib.rs
Outdated
|
||
|
||
// Ugly code, just to see if using case insensitive comparison will help | ||
let exact_match = names.iter().find(|x| x.as_str().to_uppercase() == name.as_str().to_uppercase()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
case_insensitive_match
would be a less-misleading name.
src/librustc_resolve/lib.rs
Outdated
Some(found) => Some(found.clone()), | ||
_ => None, | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Less nesting:
if case_insensitive_match.is_some() {
return case_insensitive_match.cloned()
}
src/librustc_resolve/lib.rs
Outdated
_ => None, | ||
} | ||
} | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It occurs to me that we should likely be making this change as a "tiebreaker" in syntax::util::lev_distance::find_best_match_for_name
itself: the only time case-insensitive match is going to do better than minimizing Levenshtein distance is when there's a tie for the minimum and the method makes an unlucky choice among the minima, as in the motivating example (TyInt
and TyUint
are both Levenshtein-distance 1 away from TyUInt
).
It looks very silly when you write |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agree with the comments above.
src/librustc_resolve/lib.rs
Outdated
Some(found) => Some(found.clone()), | ||
_ => None, | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This code can be written as
if let Some(found) = exact_match {
return Some(found.clone());
}
src/librustc_resolve/lib.rs
Outdated
|
||
|
||
// Ugly code, just to see if using case insensitive comparison will help | ||
let exact_match = names.iter().find(|x| x.as_str().to_uppercase() == name.as_str().to_uppercase()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should be the following to account for @petrochenkov's comment
names.iter().find(|x| {
x.as_str().to_uppercase() == name.as_str().to_uppercase() && x.as_str() != name.as_str()
})
src/test/ui/issue46332.rs
Outdated
|
||
fn main() { | ||
TyUInt(); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You're missing the .stderr
file in this PR.
struct TyInt {} | ||
|
||
fn main() { | ||
TyUInt {}; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You need to add a comment below this to test for errors that do not care for the ui output:
//~^ ERROR cannot find struct, variant or union type `TyUInt` in this scope
The lack of this is why the test failed. It was made mandatory recently.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you! Fixed the build.
@bors r+ |
📌 Commit f18446e has been approved by |
An alternative would be to adjust the distance formula to incorporate case differences, e.g. insertion/deletion/arbitrary substitution would still add |
…ebank Add case insensitive comparison, besides Levenstein for DYM Closes #46332 Draft version. The idea is that Levenstein does not work for some cases when we have multiple equal weights for strings. I didn't understand the case with `if found != name => Some(found)` so it means that new code does not work correctly yet. At least now I think that we might return all maximal weights from levenstein and think about next cases in priority order: 1) There is exact match -> None 2) There is exact match, but case insensitive -> Some(match) 3) There is some match from levenstein -> Some(matches.take_any) 4) There is no match -> None @estebank WDYT?
☀️ Test successful - status-appveyor, status-travis |
…match-names, r=estebank Suggest a case insensitive match name regardless of levenshtein distance Fixes rust-lang#86170 Currently, `find_best_match_for_name` only returns a case insensitive match name depending on a Levenshtein distance. It's a bit unfortunate that that hides some suggestions for typos like `Bar` -> `BAR`. That idea is from rust-lang#46347 (comment), but I think it still makes some sense to show a candidate when we find a case insensitive match name as it's more like a typo. Skipped the `candidate != lookup` check because the current (i.e, `levenshtein_match`) returns the exact same `Symbol` anyway but it doesn't seem to confuse anything on UI tests. r? `@estebank`
…match-names, r=estebank Suggest a case insensitive match name regardless of levenshtein distance Fixes rust-lang#86170 Currently, `find_best_match_for_name` only returns a case insensitive match name depending on a Levenshtein distance. It's a bit unfortunate that that hides some suggestions for typos like `Bar` -> `BAR`. That idea is from rust-lang#46347 (comment), but I think it still makes some sense to show a candidate when we find a case insensitive match name as it's more like a typo. Skipped the `candidate != lookup` check because the current (i.e, `levenshtein_match`) returns the exact same `Symbol` anyway but it doesn't seem to confuse anything on UI tests. r? ``@estebank``
Closes #46332
Draft version. The idea is that Levenstein does not work for some cases when we have multiple equal weights for strings. I didn't understand the case with
if found != name => Some(found)
so it means that new code does not work correctly yet.At least now I think that we might return all maximal weights from levenstein and think about next cases in priority order:
@estebank WDYT?