Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Replace match_def_path and friends with is_item #7647

Closed
wants to merge 3 commits into from

Conversation

Jarcho
Copy link
Contributor

@Jarcho Jarcho commented Sep 8, 2021

This is a pair of new util functions is_item and is_any_item. These will take anything that can be turned into a DefId (currently DefId, Res, (QPath,HirId), Expr, Pat, and Ty) and anything that can refer to a specific item (currently a def path, diagnostic item and lang item). This replaces the handful of functions we currently have to do certain combinations of these.

The following functions are then removed:

  • is_qpath_def_path
  • is_expr_path_def_path
  • is_expr_diagnostic_item
  • match_any_def_paths
  • match_any_diagnostic_items
  • match_def_path
  • is_type_diagnostic_item
  • is_type_lang_item
  • match_type

The internal lint match_type_on_diagnostic_item has been changed to work on the new function. It will also check consts and statics from external crates and check for lang items.

changelog: none

@rust-highfive
Copy link

r? @giraffate

(rust-highfive has picked a reviewer for you, use r? to override)

@rust-highfive rust-highfive added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties label Sep 8, 2021
@Jarcho Jarcho force-pushed the is_item branch 2 times, most recently from 3fc960d to d4cc838 Compare September 8, 2021 21:07
@flip1995
Copy link
Member

flip1995 commented Sep 9, 2021

This is a really big refactor and bitrotty. So I don't want to put all of the burden of reviewing this on @giraffate. So first @rust-lang/clippy what are your thoughts on this? And then, if you have time, please pick a part of the PR and help reviewing it.

@Jarcho
Copy link
Contributor Author

Jarcho commented Sep 9, 2021

I'll split this up into two PRs. One with just the switchover to is_item and another with all the refactoring around it. Should make it easier to review.

* `is_qpath_def_path`
* `is_expr_path_def_path`
* `is_expr_diagnostic_item`
* `match_any_def_paths`
* `match_any_diagnostic_items`
* `match_def_path`
* `is_type_diagnostic_item`
* `is_type_lang_item`
* `match_type`
@Jarcho Jarcho force-pushed the is_item branch 4 times, most recently from 2f5b2ed to 6b64159 Compare September 9, 2021 16:01
* Check for `is_item` instead
* Check consts and statics from external crates
* Check for lang items
* Check for inherent functions which have the same name as a field
@Jarcho
Copy link
Contributor Author

Jarcho commented Sep 9, 2021

Pulled out the refactorings. Only thing left is the switch to is_item and is_any_item which were all automated minus the manual fixups to the imports. Should be much easier to review now.

@camsteffen
Copy link
Contributor

We definitely need to simplify our approach to item lookups. I've also toyed with creating utils like this. I'm a little unsure that this is the right direction though. Lately I've been thinking that we should simply move towards diagnostic items.

In particular, I would really like to have a diagnostic item "reverse lookup" added to rustc:

match cx.tcx.get_diagnostic_name(def_id)? {
    sym::vec_type => ..

If we do go with this approach, I would prefer that is_any_item returns bool instead of Option<usize>. To me returning an index feels un-Rusty, like C-style for loops.

@Jarcho
Copy link
Contributor Author

Jarcho commented Sep 10, 2021

That would definitely work better when checking for multiple diagnostic items.

There's still a need to simplify getting the DefId in the first place. Things like this happen all the time:

if_chain! {
    if let ExprKind::Call(fn_expr, args) = e.kind;
    if let ExprKind::Path(ref p) = fn_expr.kind;
    lf let Some(def_id) = cx.qpath_res(p, fn_expr.hir_id).opt_def_id();
    if cx.tcx.is_diagnostic_item(_, def_id);
    then {
    	// do stuff with args
    }
}

Lang items can also use a simplified check as well. cx.tcx.lang_items().item().map_or(false, |lang_id| lang_id == def_id) is a little wordy. Paths will also still need to be handled for external crates (e.g. itertools and regex) as well.

If we do go with this approach, I would prefer that is_any_item returns bool instead of Option<usize>. To me returning an index feels un-Rusty, like C-style for loops.

The name could be better (it's just wrong). There are quite a few cases where knowing which one matched is necessary though.

Edit:
Just a note about reversing the diagnostic item lookup (before I forget about this). They're also used to lookup the DefId for a specific trait (Display, Ord, DoubleEndedIterator and RangeBounds are used in clippy). Looks like this is always done to check if a type implements a specific trait. This could still be done if diagnostic item lookup is flipped, but it would require looking up the diagnostic name of every trait the type implements.

@bors
Copy link
Collaborator

bors commented Sep 11, 2021

☔ The latest upstream changes (presumably #7663) made this pull request unmergeable. Please resolve the merge conflicts.

@camsteffen
Copy link
Contributor

camsteffen commented Sep 12, 2021

Here is a util idea I have

fn is_path_to_item(cx, impl MaybePath, impl ItemRef);

impl MaybePath is Expr | Pat | QPath. This has a smaller surface area than is_item, but it's more expressive. I would like preserving the "path to" concept in code.

This could be complimented with is_item_id(cx, impl ItemRef, DefId).


There are quite a few cases where knowing which one matched is necessary though.

You could split into multiple invocations of is_item/is_any_item. Using indices is harder to read and more error prone IMO.

@Jarcho
Copy link
Contributor Author

Jarcho commented Sep 12, 2021

That still leaves Res and Ty out, but they could be fit in pretty easily.

My issue with splitting the functions is justifying why they're two separate functions. As in from the perspective of the person calling the function, what problem are you solving by having two functions as opposed to one. I can kind of do that for expressions and patterns just by being clear that they only work on paths rather than, for example, method calls (it would make the function useless as it can't disambiguate between a path to a method vs. a call to a method, but I could see someone possibly using it to check a method call node without really thinking). At this point the cost is the same either way. Either you remember that paths use a different function (for is_path_to_item), or the function only works on paths for expressions and patterns (for is_item).

If we do go with splitting the functions, please don't have different argument orders. Either one works, but differing orders are just pointless friction.

@camsteffen
Copy link
Contributor

That still leaves Res and Ty out, but they could be fit in pretty easily.

True. I think Res is mostly an intermediate step for the other cases so it would be three widely used functions. Potentially an "any" variant for each one, but I'd also be fine with just using Iterator::any.

I'm torn on is_item. I do like the simplicity on one hand. But on the other hand it seems too ambiguously defined. It doesn't support ExprKind::MethodCall, but it could, and it's not obvious to me where that line should be drawn.

If we do go with splitting the functions, please don't have different argument orders. Either one works, but differing orders are just pointless friction.

Agreed. I would apply the "yoda condition" rule which would put ItemRef after the thing being checked.

@Jarcho
Copy link
Contributor Author

Jarcho commented Sep 15, 2021

It doesn't support ExprKind::MethodCall, but it could, and it's not obvious to me where that line should be drawn.

There is a line here. Checking for a ufcs call to Option::unwrap would look like this

if let ExprKind::Call(fn, _) = e.kind {
    is_item(cx, fn, &OPTION_UNWRAP)
} else {
    false
}

If we allow method calls this could match either Option::unwrap(foo) or foo.unwrap()() which are two very different things. So there are a small number of cases where it doesn't give a useful result. I would say knowing this would make the line clear, but the reason isn't obvious.


I'd be in favour of a good abstraction around method and function calls. There is match_function_call, but I would like something that matches the arguments before checking the DefId.

@camsteffen
Copy link
Contributor

You make a compelling case for excluding ExprKind::MethodCall. It's unfortunate that the reason doesn't stem from the meaning of is_item or MaybeDefId.

I think is_item would work fine in practice. My preference still would be to have more tightly defined abstractions.

@Jarcho
Copy link
Contributor Author

Jarcho commented Sep 18, 2021

I can't really think of a single name that gets across subtleties like that.

Just some other things to consider:

  • Pat can work on PatKind::TupleStruct and PatKind::Struct as well. There's no ambiguity here as they both contain a path directly.
  • Expr can work with ExprKind::Struct for the same reason as above.
  • Various other types like AdtDef and VariantDef could also be fit in here. Not really a big deal as they have a def_id field.

@camsteffen
Copy link
Contributor

Discussed this in the Clippy meeting today. We agreed on the following.

  1. We'd like to move towards just using diagnostic items.
  2. is_item feels like too much "magic" or over-abstraction
  3. There may be opportunity to introduce new utils in this area, but before doing that, we should migrate more paths to diagnostic items and adopt get_diagnostic_name which is now landing soon.

Even if we don't adopt is_item, this PR is a valuable exploration of what is possible with Clippy utils. 👍

@Jarcho
Copy link
Contributor Author

Jarcho commented Oct 6, 2021

I'll pull out the fixes for the internal lint into a different PR since those will be needed anyways. Should I open an issue summarizing things and possible steps forward?

We'd like to move towards just using diagnostic items.

I don't think this is possible for external crates (currently we use regex and itertools). Even if we could get external crates tagged with diagnostic items old versions still wouldn't have them.

There may be opportunity to introduce new utils in this area, but before doing that, we should migrate more paths to diagnostic items and adopt get_diagnostic_name which is now landing soon.

Is that replacing diagnostic_items or in addition to it?

@camsteffen
Copy link
Contributor

Should I open an issue summarizing things and possible steps forward?

Go for it! For reference we have #5393 for migrating to diagnostic items and I just opened #7784 for get_diagnostic_name.

We'd like to move towards just using diagnostic items.

I don't think this is possible for external crates (currently we use regex and itertools). Even if we could get external crates tagged with diagnostic items old versions still wouldn't have them.

True, there will be some exceptions. We can just implement those cases more "naively" instead of having utils. And/or we can separate path utils into a separate module to declutter.

Is that replacing diagnostic_items or in addition to it?

In addition - for cases where multiple queries can be replaced with one query.

@Jarcho Jarcho closed this Jul 19, 2022
bors added a commit that referenced this pull request Oct 2, 2022
…ogiq

Fix and improve `match_type_on_diagnostic_item`

This extracts the fix for the lint out of #7647. There's still a couple of other functions to check, but at least this will get lint working again.

The two added util functions (`is_diagnostic_item` and `is_lang_item`) are needed to handle `DefId` for unit and tuple struct/variant constructors. The `rustc_diagnostic_item` and `lang` attributes are attached to the struct/variant `DefId`, but most of the time they are used through their constructors which have a different `DefId`. The two utility functions will check if the `DefId` is for a constructor and switch to the associated struct/variant `DefId`.

There does seem to be a bug on rustc's side where constructor `DefId`s from external crates seem to be returning `DefKind::Variant` instead of `DefKind::Ctor()`. There's a workaround put in right.

changelog: None
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
S-waiting-on-review Status: Awaiting review from the assignee but also interested parties
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants