Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add lint group infra #960

Open
wants to merge 5 commits into
base: main
Choose a base branch
from

Conversation

suaviloquence
Copy link
Contributor

Adds:

  • the Identifier enum for consistent configuration with lint groups and lints
  • OverrideStack::try_push which tries to compile a Map<Identifier, Override> to a Map<String (= lint id), Override>, and errors on a conflict
  • the must_use_added lint group
  • currently the lint_group field on SemverQuery is optional while we figure out what lints go in what groups (and what groups to include). must_use_added is an example from my design draft, but we can definitely go in a different direction. Eventually the goal is to make lint_group not an Option, so every lint is associated with exactly one lint group
  • some tests + documentation, and user documentation in the README

impl Identifier {
/// Returns a list of lint ids that this identifier refers to, given the set of
/// all queries.
pub(crate) fn lints<'a>(
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If/when we add dynamic groups like warnings, major, and minor, this will need an extra &OverrideStack argument to determine this. We call this method in OverrideStack::try_push, so this will be an easy change.

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Notes like this are excellent candidates to put into the source code directly. We won't remember to check this PR if/when we add groups like warnings/major/minor, but we'll probably find the note in the source code.

Suggested change
pub(crate) fn lints<'a>(
//
// N.B.: If we add dynamic groups like `warnings`, `major`, or `minor`, this function
// will require an additional `&OverrideStack` argument.
pub(crate) fn lints<'a>(

If possible, expand the above suggestion with more detail on what that extra argument would be used for and how. Right now, it isn't immediately clear to me what it would do.

pub(crate) fn lints<'a>(
&'a self,
all_queries: &'a BTreeMap<String, SemverQuery>,
) -> Vec<&'a str> {
Copy link
Contributor Author

@suaviloquence suaviloquence Oct 1, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could also make this a Vec<&'a SemverQuery>, but we would need to handle the case where an unknown lint id was provided (but this may be a good thing, currently there is no warning for this). This would cause a slight mismatch in the overridestack API though, which currently works off of lint ids.

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Identifier doesn't quite strike me as the right place for this logic. We're giving the larger pool of data (&'a BTreeMap<String, SemverQuery>) to the smaller type (a two-variant enum) and asking it to figure it out.

Also, I feel like we want the "resolve the identifiers" logic to have Result types so we can complain about bad IDs more explicitly. And we really need to test those cases before someone gets bitten by a silent bug around bad IDs...

for lint in lints {
if let Some(entry) = compiled_map.get_mut(lint) {
if let Some(lint_level) = overrides.lint_level {
if let Some(old) = entry.lint_level {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Currently it is an error even if lint_level == old. Should this be the case? Maybe just a warning?

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When might this happen?

If I'm understanding it correctly, there's no conflict since both sides are setting the same value, no?

Copy link
Owner

@obi1kenobi obi1kenobi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unfortunately, I think this needs a lot more design on the UX side of things first, before we get into the weeds of how it's implemented under the hood.

  • We need to explicitly enumerate which error cases happen when.
  • We need super high-quality error messages in each case, pointing to the specific places that caused the problem and offering concrete next steps.
  • We need to consider what happens when we add more lint groups, add new lints to various combinations of lint groups, or newly add existing lints to existing groups. Edge cases are everywhere!
  • We need to consider the "upgrade experience" of `cargo-semver-checks. If we aren't extremely careful, adding a new lint will be a major breaking change and will anger our biggest users first — the people we want to keep the most happy.

I think taking a step back from the code and producing something more akin to a design doc would be a great next step.

impl Identifier {
/// Returns a list of lint ids that this identifier refers to, given the set of
/// all queries.
pub(crate) fn lints<'a>(
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Notes like this are excellent candidates to put into the source code directly. We won't remember to check this PR if/when we add groups like warnings/major/minor, but we'll probably find the note in the source code.

Suggested change
pub(crate) fn lints<'a>(
//
// N.B.: If we add dynamic groups like `warnings`, `major`, or `minor`, this function
// will require an additional `&OverrideStack` argument.
pub(crate) fn lints<'a>(

If possible, expand the above suggestion with more detail on what that extra argument would be used for and how. Right now, it isn't immediately clear to me what it would do.

@@ -103,6 +103,10 @@ pub struct SemverQuery {
/// The default lint level for when this lint occurs.
pub lint_level: LintLevel,

/// The [`LintGroup`] that this query is a member of (currently optional).
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
/// The [`LintGroup`] that this query is a member of (currently optional).
/// The [`LintGroup`] that this query is a member of, if any.

Also, should this be a Vec<LintGroup> by any chance?

Might there be a situation where we want more than one group per lint? And if so, is the existing priority system sufficient to handle group conflicts?

Comment on lines +472 to +474
Note that `function_must_use_added` still has a `major` required update bump, because the entry with
priority -1 left the `required-update` to its default value, so `cargo-semver-checks` will use the
entry with less precedence to calculate the required version bump.
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We need a succinct, memorable way to describe this algorithm.

How about something like:

Suggested change
Note that `function_must_use_added` still has a `major` required update bump, because the entry with
priority -1 left the `required-update` to its default value, so `cargo-semver-checks` will use the
entry with less precedence to calculate the required version bump.
The final values for the `level` and `required-update` fields are computed separately, each in priority order.
If the entry with the most-negative `priority` doesn't set both `level` and `required-update` values, the final configuration may have elements of multiple config entries. This happens
in the above example, where `function_must_use_added` still has a `major` required update bump:
- The entry with priority -1 did not override the `required-update` field. We keep looking.
- In priority order, the next entry is the one with priority 0. It set `required-update = "major"`.
- The lint uses `required-update = "major"`.

Comment on lines +437 to +443
When there are multiple conflicting entries for a given lint (for example, configuring the
`must_use_added` group containing the `function_must_use_added` lint and `function_must_use_added` itself),
`cargo-semver-checks` needs more information to determine which configuration entry takes
precedence. Like the Cargo `[lints]` table, each configuration entry has an optional
`priority` field that defines the order of configuration entries. If an entry has no
`priority` key, its priority is 0 by default. Entries with a lower (more negative)
`priority` override those with a higher `priority` when there is a conflict.
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Consider explaining first, then providing an example, rather that doing both at once. Weaving lint groups and a specific lint into the middle of the explanation leaves readers with a lot of information to track.

Consider adapting the explanation I used below if you decide you like it. Something along the lines of "for each setting in each lint, we inspect config items in priority order and pick the first one that sets a value for that setting."

Comment on lines +171 to +172
// TODO: replace with LazyLock once MSRV is >= 1.80
pub(crate) static ALL_QUERIES: OnceLock<BTreeMap<String, SemverQuery>> = OnceLock::new();
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I really dislike globals like this. They make testing a mess.

Can you think of any reasonable way we can avoid this?

for lint in lints {
if let Some(entry) = compiled_map.get_mut(lint) {
if let Some(lint_level) = overrides.lint_level {
if let Some(old) = entry.lint_level {
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When might this happen?

If I'm understanding it correctly, there's no conflict since both sides are setting the same value, no?

///
/// Returns an `Err` if there is a configuration conflict in the passed map: when
/// multiple entries in `item` configure the same entry for the same lint. To handle
/// configuration precedence in a defined way, call this function multiple times, and
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

defined way is ominously vague here, and it's in the middle of a very long sentence which makes me feel it's unlikely I'd manage to use this function properly =/

if let Some(entry) = compiled_map.get_mut(lint) {
if let Some(lint_level) = overrides.lint_level {
if let Some(old) = entry.lint_level {
anyhow::bail!("Configuration conflict detected for lint `{lint}`:\n\
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think our bail errors are generally lowercase since bail!(<stuff>) gets printed as error: <stuff> when the program exits. If I'm right about that, any capitalized error messages should be tested + changed.

if let Some(lint_level) = overrides.lint_level {
if let Some(old) = entry.lint_level {
anyhow::bail!("Configuration conflict detected for lint `{lint}`:\n\
lint level is set to `{old:?}` and {lint_level:?} at the same config priority.");
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is unfortunately a supremely non-useful error message: it tells the user "go figure it out" without actually saying which entries are the ones at that priority. The entries might not be easy to find, since they might be on the lint itself, or any of its groups.

This would be very frustrating to encounter.

expression: result
---
--- error ---
add a `priority` key to the entries in [package.metadata.cargo-semver-checks.lints].
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not a great error message.

Imagine the user's Cargo.toml has 100 config entries. Suddenly, one of them caused a conflict. (Maybe it's because we added a new lint that is the first one to be in a particular combination of two groups simultaneously? The user had specified settings for each group, and they didn't conflict in our previous release. In the new release with the new lint, they do! Oops!)

Now we just told the user "hey please add priority to all your 100 config entries." They'll uninstall the tool immediately.

@obi1kenobi obi1kenobi linked an issue Oct 1, 2024 that may be closed by this pull request
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Create configurable groups of lints
2 participants