Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

CodeMap::remove_file #7

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

stepancheg
Copy link
Contributor

Useful for long running processes with state (think REPL).

Current implementation is linear, can be optimized later to be
O(1).

Useful for long running processes with state (think REPL).

Current implementation is linear, can be optimized later to be
`O(1)`.
@kevinmehall
Copy link
Owner

So far the goal has been to make it hard to obtain invalid spans, while going this direction probably means that more functions should return Option. What happens if a line in your REPL defines a function that is later involved in an error and needs a span lookup?

I worry about adding methods for operations the datastructure isn't really designed for. The "index into concatenated files" technique that CodeMap uses is a better fit for batch compilers. For things like language servers that may process more code than a REPL, the 32-bit position space could easily run out. I believe rustc incremental compilation used in RLS stores uncompressed spans across requests, and uses a fresh CodeMap only within the code that updates the index for a crate.

@stepancheg
Copy link
Contributor Author

What happens if a line in your REPL defines a function that is later involved in an error and needs a span lookup?

A function may have a reference to a struct which references File and CodeMap, and remove that File when last reference to File is dropped.

32-bit position space could easily run out

Yep, I think it should be 64-bit.

fresh CodeMap

I was thinking about having own struct like

struct SpanWithCodeMap { span, Rc<CodeMap> }

But unfortunately codemap-diagnostic crate heavily depends on Span without context (e. g. struct Diagnostic or emitter which requires all spans to belong to single code map), so it's hard to use that struct without rewriting copy-pasting a lot codemap-diagnostic.

That's why I came up with this solution.

@stepancheg
Copy link
Contributor Author

@kevinmehall
Copy link
Owner

Yeah, I think you're running into some of the same limitations that I have. The "index into concatenated files" thing from rustc seems clever but is probably premature optimization for many cases.

One change I've been thinking about is to make Arc<File> more like a single-file CodeMap that could be used standalone or added to multiple different CodeMaps. For something like an AST, where everything comes from a single source file, it would work to store the Arc<File> next to the AST root (or next to each function AST, or whatever), and have the AST contain spans relative to that file. That would make it easier to create, destroy, and replace ASTs without mutating a global CodeMap. When lowering the AST into something that contains spans from multiple files, the files could be added to a CodeMap and the spans translated. In the language I'm working on, that CodeMap would be short-lived and more like a batch process, but I'm not sure that's universal.

I think the codemap-diagnostic API should replace its use of CodeMap with a trait that could be implemented for CodeMap, File, or external custom types. I was thinking that Span should be an associated type of that trait, but you bring up a good point that it may be limiting to require all spans within the same diagnostic message to come from the same CodeMap. One complication there is it must be able to relate two spans to each other if they come from the same file, because it can display multiple annotations in one snippet.

@stepancheg
Copy link
Contributor Author

OK, there is a couple of ways to fix this issue. Do you know how to fix it without breaking backward compatibility? (Codemap-diagnostic follows semver conventions I presume.)

@Lymia
Copy link

Lymia commented Jul 13, 2019

I'm currently working on a project that codemap-diagnostic would be very useful for, however, I've also found that many of the design decisions made in codemap are not ideal for my particular project or coding style. And, frankly, there will never be a one-size-fits-all answer. Just, perhaps, "the simplest one that works" as a default.

A lot of these trade-offs were made intentionally for one reason or another in rustc, and have changed over times. There's a lot of projects that would want the ready-made (and great) compiler error emitter, but may want to make very different tradeoffs in the code map/span encoding.

I'd be willing to work on making codemap-diagnostic agnostic to the underlying code map implementation to help with such issues.

@kevinmehall
Copy link
Owner

Let's move the codemap-diagnostic side of this over to an issue on that repo: kevinmehall/codemap-diagnostic#7

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants