Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Should Naga's error representation be the public error API? #4429

Open
ErichDonGubler opened this issue May 2, 2023 · 3 comments
Open

Should Naga's error representation be the public error API? #4429

ErichDonGubler opened this issue May 2, 2023 · 3 comments
Assignees
Labels
kind: diagnostics Error message should be better naga Shader Translator

Comments

@ErichDonGubler
Copy link
Member

ErichDonGubler commented May 2, 2023

Naga currently exposes its internal representation of shader compilation errors as a public API (which I'll call being “having transparent error types”). The main reason we're interested in being “transparent” with error details is because there's lots of tooling that we want to mature on top of Naga, like language servers and error localization. The space of this tooling is still open-ended to us, and we're not sure what tooling in that space will actually need, yet.

The above has some maintenance costs. Every change to these error types is tantamount to a public API decision. This can force what might otherwise be “mostly” internal compiler development to suffer API design decision-making each time. Even adding new error cases can break code downstream. Phrased another way: Should changes to internal error representation be a SemVer-breaking change?

We could create a more “opaque” API that hides Naga's internal error representation in order to alleviate the above. However, making a formal API for exposing “interesting” error details has its own costs:

  • We would have to design an API that we're confident an ecosystem can be built with. This will require a significant up-front effort in designing an API for a large set of existing potential use cases. Because we're likely to not get that right on the first try, this also implies a few iterations of breaking changes to make an API capturing those use cases.
  • Over time, we may run the risk of not exposing interesting error information for use cases we haven't yet considered. This, in turn, might stifle the ecosystem we want to let the community grow.

This issue is intended to host discussion of the above points, and log a decision point in one of the noted directions.

Use cases that folks have been kind enough to elaborate on:


Motivating context: #4494

@dannymcgee
Copy link

dannymcgee commented May 11, 2023

To add a little more context from my side, here's how I'm currently translating Naga's validation and parse errors into LSP diagnostics, and here's where I'm invoking Naga's parser and validator. (Please ignore my horrifying application architecture, I have basically no idea what I'm doing.)

Some miscellaneous notes:

I recently brought this project up to date with Rust's 2021 edition after not touching it for like a year and a half, and I was shocked by how little I needed to touch the Naga-related code after upgrading from some random commit SHA to the latest release version. I can only speak for myself, but I would say that while I'm grateful for the API stability, I don't think anyone could blame you for making breaking changes more frequently, especially given how new and dynamic the WebGPU spec is currently.

I'm using very little of the exposed API surface currently and still producing valuable feedback for development. I'm not doing anything to resolve the numeric type references in the error strings at the moment (there's a long list of features I'm aiming to support, so diagnostics haven't gotten a ton of individual attention yet), but looking over the other fields present on the error types, it seems like it would be pretty easy to do so.

Speaking generally as a developer who tens to be wary of third-party dependencies, I really appreciate the granularity and flexibility of the APIs that are exposed here. Prior to this, I had experimented with building a similar tool for GLSL, and my only option for producing diagnostics was to create temporary files on the file system that contained the pre-processed code, run a CLI executable against those files, and manually parse the output string for line numbers, etc. It was a nightmare and I never got it working reliably.

I have plans to build out some new features on top of WGSL by using a preprocessor, and this API exposes enough information that I feel pretty confident in my ability to do some pretty powerful things without needing to fork the validator. For example, I could mangle type and function names in the preprocessed code that's provided to Naga, and be able to map them back to the user-defined names in the original source files without a lot of fuss.

My gut feeling is that the depth of API exposure here is basically what marks the difference between a library and an application. The GLSL validator I mentioned earlier is an application — it has one job, it does it through a tiny, sharply defined API, and if you want to do anything outside the box with it you have to expect to deal with the associated headaches of that. You're probably better off just forking it or rebuilding its functionality from scratch. But Naga being a library means that I can import it in my own application code and make use of its various components to build out my own functionality with a lot of flexibility. That's kind of its whole value proposition, IMO. If folks need a simple tool that just takes some WGSL code and prints some nicely formatted error messages to the terminal, that's certainly valid, and one could leverage Naga to build a CLI application that does exactly that (EDIT: TIL Naga CLI is basically that — forgive me for being out of the loop). But Naga is powerful enough to do a whole lot more than that — and isn't that the whole point?

@jimblandy
Copy link
Member

jimblandy commented May 12, 2023

Naga is not yet at the point of being able to offer a stable API, worry about semver guarantees, and so on. We'll definitely be breaking a lot as we catch up with WebGPU v1. For example, gfx-rs/naga#2266 and gfx-rs/naga#2309 are both going to be substantial changes to the IR; there's more to come like that.

Even after we achieve spec compliance, I'm pessimistic. Keeping an IR stable is much more restrictive to the implementors than keeping, say, WGSL's syntax backwards-compatible. This is why, for example, the regex crate doesn't have any public API that accepts regex_syntax::Ast or regex_syntax::Hir values as input: if it did, then those would become part of regex's public API. Or: ask anyone who has maintained LLVM-based code for a long time what it's like upgrading to a new version of LLVM.

This doesn't mean we can't respect the semver rules. We just need to set expectations around going through sixty major versions every year. I don't think that's necessarily a problem (although we'll need to deploy rust-semver or something equivalent to keep us honest).

So I think we have pretty free rein to experiment with error representations.

@jimblandy
Copy link
Member

Thinking about this a bit more:

It may be the case that there are many consumers that don't need full access to the IR, who simply want to be able to pass in a string and get another string back out, with some basic reflection data. We certainly intend to keep the WGSL we accept backwards-compatible, for Firefox's use case where breaking existing shaders would break web sites, who often have nobody available to fix them.

For such users, it might make sense to offer an stabler-naga crate (please think of a better name) that offers a more restricted API than Naga itself and tries not to break it. There's no reason this crate couldn't keep up with the latest and greatest Naga without changing its own semver. Ideally, users like Bevy could confine themselves to stabler-naga in exchange for fewer breaking changes.

I don't know what Bevy (say) needs from Naga, though, so we don't really know what would belong in the stabler-naga API.

@cwfitzgerald cwfitzgerald transferred this issue from gfx-rs/naga Oct 25, 2023
@cwfitzgerald cwfitzgerald added the naga Shader Translator label Oct 25, 2023
@ErichDonGubler ErichDonGubler self-assigned this Jan 22, 2024
@ErichDonGubler ErichDonGubler added the kind: diagnostics Error message should be better label Jan 22, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
kind: diagnostics Error message should be better naga Shader Translator
Projects
None yet
Development

No branches or pull requests

4 participants