Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

update handling of parser errors #3920

Merged
merged 2 commits into from
Nov 27, 2019

Conversation

calebcartwright
Copy link
Member

A couple minor updates to the handling of parser errors in order to cover a few additional scenarios. Follow up to #3782 and refs #3916 (comment)

  • now when the parser error level is Fatal, the parse error details will be emitted (unless hide_parse_errors is true), regardless of whether the source file is ignored because the parser won't be able to give rustfmt the ast anyway in the event of a fatal error. Previously rustfmt was suppressing the fatal parser error text if the source file was ignored even though rustfmt wouldn't be able to continue and would have to exit with an error.
  • even when hide_parse_errors is true, the same logic checks are applied to see if DB source file origin was in an ignored file so that the same error reset logic can be used to ignore the recoverable parser errors

@@ -76,3 +76,7 @@ version = "610.0.0"
package = "rustc-ap-syntax_pos"
version = "610.0.0"

[dependencies.rustc_data_structures]
package = "rustc-ap-rustc_data_structures"
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was already a part of the dependency tree (since nearly every rustc-ap-* package uses it), but adding it directly to gain access to one of the traits for use in the parser emitter.

Box::new(SilentEmitter {})
}

/// Emit errors against every files expect ones specified in the `ignore_path_set`.
struct SilentOnIgnoredFilesEmitter {
ignore_path_set: Rc<IgnorePathSet>,
source_map: Rc<SourceMap>,
emitter: EmitterWriter,
emitter: Box<dyn Emitter + sync::Send>,
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This change also makes it a lot easer to unit test the emitter logic vs explicitly using the EmitterWriter struct

Copy link
Contributor

@topecongiro topecongiro left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for the PR! However, I quite do not understand the motivation behind this PR. We should not surface any parsing errors from ignored files as they tend to contain malformed syntax.

the parser won't be able to give rustfmt the ast anyway in the event of a fatal error

rustfmt doesn't parse the ignored files, so this shouldn't be a problem.

@calebcartwright
Copy link
Member Author

calebcartwright commented Nov 14, 2019

However, I quite do not understand the motivation behind this PR

This is basically some fixes/follow ups to the change from #3782 where we added support to not have rustfmt exit with an error when it encounters recoverable parser errors that originate from ignored files. There was also some relevant conversation around that original change in #3779 and #3777 for historical context. There's two core fixes in this PR:

incorrect handling of parser errors when hide_parse_errors is true

Currently, when hide_parse_errors is set to true, rustfmt will still exit with an error (incorrectly) even if all the parser errors were recoverable and originated from ignored files.

This is beacuse the change we added in #3782 does not get utilized when hide_parse_errors is enabled because I somehow forgot that config option resulted in an entirely different emitter being used 🤦‍♂

This PR fixes that problem.

This bug can be reproduced today (on master) by adding hide_parse_errors = true to this config file used in the test suite: https://github.com/rust-lang/rustfmt/blob/master/tests/config/issue-3779.toml
and then running the tests. The corresponding system and idempotence tests for https://github.com/rust-lang/rustfmt/tree/master/tests/source/issue-3779 will fail

silently erroring on fatal parser errors, even when hide_parse_errors is false

The change in #3782 ensures that rustfmt will suppress parser errors from ignored files. However, it currently does so regardless of whether that parser error is actually recoverable/resettable.

This means that when rustfmt encounters a fatal error from an ignored file it is silently exiting with an error even when hide_parse_errors is false. I personally think that can be a source of confusion for users (#3916 as an example).

My proposed change here is that if there is a fatal/nonrecoverable parser error then rustfmt should not suppress error message when hide_parse_errors is false even if the error came from an ignored file, because rustfmt will not be able to ignore/reset the parse error.

Does that help clarify the intent?

@topecongiro
Copy link
Contributor

@calebcartwright Thank you for the explanation! It helped me understand the intention of this PR and the actual problem 😃

Now that I think about it, it seems that the root cause of the problem is how we parse files and create mappings between filenames and AST nodes.

In the current implementation, first, in parse_crate, rustfmt parses the entire module tree and construct a complete AST. Then, in ModResolver, it traverses the AST and creates a mapping from filename to the AST node.

This approach has two problems:

  1. Since we construct a complete AST in parse_crate, if the target module tree contains unparsable files, we fail to construct an AST and fail silently, even if the file is part of ignore. (this is what @calebcartwright has described in the comment, thank you!).
  2. Since we are constructing the complete AST, if some modules are missing, then we fail to construct an AST as well (e.g., mod foo; without foo.rs).

To fix the actual cause of the problem, we need to create a mapping from file names to the AST node without creating a complete AST. I think that this is possible by rolling out our module resolver.

I am thinking of something like the following:

// This stack keeps track of items inside a file.
let mut item_stack = vec![];
// Initialize the stack with the items in the entry file.
for item in main_file.sub_mods() {
    item_stack.push((item, main_file.file_name()));
}

// This is the mapping from filename to AST node.
let mut file_to_ast_node_map = BTreeMap::new();

while let mut Some((item, file_name)) = item_stack.pop() {
    if !item.is_mod() {
        continue;
    }
        // If the mod should be ignored, we won't parse it.
    if should_ignore(item) {
        continue;
    }
    // Try to parse the mod item. If succeed then we add a mapping.
    // Otherwise, we return an error or keep processing, depending on
    // the configuration.
    match parse_mod_item(item) {
        Ok(parsed) => souremap.insert(file_name, parsed),
            Err(err) => {
                if error_on_unparsable_file() {
                    return Err(err);
                } else {
                    continue;
                }
            }
        }
    }
}

Ok(file_to_ast_node_map)

Copy link
Contributor

@topecongiro topecongiro left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Concerning this PR, this looks good to me as it fixes the bug and improves the user's experience. However, as I mentioned in the above comment, please note that we may need to reconsider how we parse the module tree, and hence, we may replace the relevant code entirely with the new approach.

@calebcartwright Again, thank you for your PR and explanation, it helped me a lot.

@calebcartwright
Copy link
Member Author

calebcartwright commented Nov 15, 2019

That makes a lot of sense to me @topecongiro. I presume we'd also need to always set recurse_into_file_modules to false on the raw syntax parser by default as part of that solution, and then use the rustfmt config options (ATM skip_children but soon to be recursive) as part of determining whether to iterate through the sub_mods right?

Otherwise, IIUC, the parser would still continue go through and construct the entire AST up front

@topecongiro
Copy link
Contributor

That makes a lot of sense to me @topecongiro. I presume we'd also need to always set recurse_into_file_modules to false on the raw syntax parser by default as part of that solution, and then use the rustfmt config options (ATM skip_children but soon to be recursive) as part of determining whether to iterate through the sub_mods right?

Yes!

@calebcartwright
Copy link
Member Author

Yes!
👍

If you'd like, I'll open a new issue to track the work of updating the parser flow per your above solution.

Though I expect I may get stuck along the way, I could look into trying to implement it as well.

My thinking is that it feels like it would be good to have this new parser solution in place before fully switching from skip_children to recursive (not necessary to do this first, but doing so would likely avoid some rework).

@topecongiro
Copy link
Contributor

If you'd like, I'll open a new issue to track the work of updating the parser flow per your above solution.

That helps a lot, thanks!

Though I expect I may get stuck along the way, I could look into trying to implement it as well.

❤️

You should probably utilize ModResolver as it already mimics the behavior of rustc's mod resolution.

My thinking is that it feels like it would be good to have this new parser solution in place before fully switching from skip_children to recursive (not necessary to do this first, but doing so would likely avoid some rework).

My understanding is that switching from skip_children to recursive is only about flipping the boolean condition, though I haven't looked at it closely yet. Have you found any concern?

@calebcartwright
Copy link
Member Author

My understanding is that switching from skip_children to recursive is only about flipping the boolean condition, though I haven't looked at it closely yet. Have you found any concern?

No, no concerns there. I was just thinking it'd be nice to avoid flipping the bool in a few places like this:

pub(crate) fn build(self) -> Result<Parser<'a>, ParserError> {
let config = self.config.ok_or(ParserError::NoConfig)?;
let sess = self.sess.ok_or(ParserError::NoParseSess)?;
let input = self.input.ok_or(ParserError::NoInput)?;
let mut parser = match Self::parser(sess.inner(), input, self.directory_ownership) {
Ok(p) => p,
Err(db) => {
sess.emit_diagnostics(db);
return Err(ParserError::ParserCreationError);
}
};
parser.cfg_mods = false;
if config.skip_children() {
parser.recurse_into_file_modules = false;
}
Ok(Parser { parser, sess })
}

since the entire check would then need to be removed as part of the new parser solution. Minor stuff like that though, so I'll go ahead and finish up the recursive switch first.

@karyon
Copy link
Contributor

karyon commented Oct 28, 2021

backported in #4100

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants