Should Naga's error representation be the public error API? #4429

ErichDonGubler · 2023-05-02T17:22:57Z

Naga currently exposes its internal representation of shader compilation errors as a public API (which I'll call being “having transparent error types”). The main reason we're interested in being “transparent” with error details is because there's lots of tooling that we want to mature on top of Naga, like language servers and error localization. The space of this tooling is still open-ended to us, and we're not sure what tooling in that space will actually need, yet.

The above has some maintenance costs. Every change to these error types is tantamount to a public API decision. This can force what might otherwise be “mostly” internal compiler development to suffer API design decision-making each time. Even adding new error cases can break code downstream. Phrased another way: Should changes to internal error representation be a SemVer-breaking change?

We could create a more “opaque” API that hides Naga's internal error representation in order to alleviate the above. However, making a formal API for exposing “interesting” error details has its own costs:

We would have to design an API that we're confident an ecosystem can be built with. This will require a significant up-front effort in designing an API for a large set of existing potential use cases. Because we're likely to not get that right on the first try, this also implies a few iterations of breaking changes to make an API capturing those use cases.
Over time, we may run the risk of not exposing interesting error information for use cases we haven't yet considered. This, in turn, might stifle the ecosystem we want to let the community grow.

This issue is intended to host discussion of the above points, and log a decision point in one of the noted directions.

Use cases that folks have been kind enough to elaborate on:

@jasmine1717 on Matrix about Bevy preprocessing shaders to achieve a module system
@dannymcgee on Reddit, discussing an LSP they'd like to build

Motivating context: #4494

dannymcgee · 2023-05-11T21:23:26Z

To add a little more context from my side, here's how I'm currently translating Naga's validation and parse errors into LSP diagnostics, and here's where I'm invoking Naga's parser and validator. (Please ignore my horrifying application architecture, I have basically no idea what I'm doing.)

Some miscellaneous notes:

I recently brought this project up to date with Rust's 2021 edition after not touching it for like a year and a half, and I was shocked by how little I needed to touch the Naga-related code after upgrading from some random commit SHA to the latest release version. I can only speak for myself, but I would say that while I'm grateful for the API stability, I don't think anyone could blame you for making breaking changes more frequently, especially given how new and dynamic the WebGPU spec is currently.

I'm using very little of the exposed API surface currently and still producing valuable feedback for development. I'm not doing anything to resolve the numeric type references in the error strings at the moment (there's a long list of features I'm aiming to support, so diagnostics haven't gotten a ton of individual attention yet), but looking over the other fields present on the error types, it seems like it would be pretty easy to do so.

Speaking generally as a developer who tens to be wary of third-party dependencies, I really appreciate the granularity and flexibility of the APIs that are exposed here. Prior to this, I had experimented with building a similar tool for GLSL, and my only option for producing diagnostics was to create temporary files on the file system that contained the pre-processed code, run a CLI executable against those files, and manually parse the output string for line numbers, etc. It was a nightmare and I never got it working reliably.

I have plans to build out some new features on top of WGSL by using a preprocessor, and this API exposes enough information that I feel pretty confident in my ability to do some pretty powerful things without needing to fork the validator. For example, I could mangle type and function names in the preprocessed code that's provided to Naga, and be able to map them back to the user-defined names in the original source files without a lot of fuss.

My gut feeling is that the depth of API exposure here is basically what marks the difference between a library and an application. The GLSL validator I mentioned earlier is an application — it has one job, it does it through a tiny, sharply defined API, and if you want to do anything outside the box with it you have to expect to deal with the associated headaches of that. You're probably better off just forking it or rebuilding its functionality from scratch. But Naga being a library means that I can import it in my own application code and make use of its various components to build out my own functionality with a lot of flexibility. That's kind of its whole value proposition, IMO. If folks need a simple tool that just takes some WGSL code and prints some nicely formatted error messages to the terminal, that's certainly valid, and one could leverage Naga to build a CLI application that does exactly that (EDIT: TIL Naga CLI is basically that — forgive me for being out of the loop). But Naga is powerful enough to do a whole lot more than that — and isn't that the whole point?

jimblandy · 2023-05-12T05:55:11Z

Naga is not yet at the point of being able to offer a stable API, worry about semver guarantees, and so on. We'll definitely be breaking a lot as we catch up with WebGPU v1. For example, gfx-rs/naga#2266 and gfx-rs/naga#2309 are both going to be substantial changes to the IR; there's more to come like that.

Even after we achieve spec compliance, I'm pessimistic. Keeping an IR stable is much more restrictive to the implementors than keeping, say, WGSL's syntax backwards-compatible. This is why, for example, the regex crate doesn't have any public API that accepts regex_syntax::Ast or regex_syntax::Hir values as input: if it did, then those would become part of regex's public API. Or: ask anyone who has maintained LLVM-based code for a long time what it's like upgrading to a new version of LLVM.

This doesn't mean we can't respect the semver rules. We just need to set expectations around going through sixty major versions every year. I don't think that's necessarily a problem (although we'll need to deploy rust-semver or something equivalent to keep us honest).

So I think we have pretty free rein to experiment with error representations.

jimblandy · 2023-05-12T06:01:23Z

Thinking about this a bit more:

It may be the case that there are many consumers that don't need full access to the IR, who simply want to be able to pass in a string and get another string back out, with some basic reflection data. We certainly intend to keep the WGSL we accept backwards-compatible, for Firefox's use case where breaking existing shaders would break web sites, who often have nobody available to fix them.

For such users, it might make sense to offer an stabler-naga crate (please think of a better name) that offers a more restricted API than Naga itself and tries not to break it. There's no reason this crate couldn't keep up with the latest and greatest Naga without changing its own semver. Ideally, users like Bevy could confine themselves to stabler-naga in exchange for fewer breaking changes.

I don't know what Bevy (say) needs from Naga, though, so we don't really know what would belong in the stabler-naga API.

ErichDonGubler mentioned this issue May 2, 2023

Consolidate diagnostic presentation logic into library error types #4494

Open

4 tasks

Wumpf mentioned this issue Aug 6, 2023

Unable to handle shader compilation errors #2130

Closed

cwfitzgerald transferred this issue from gfx-rs/naga Oct 25, 2023

cwfitzgerald added the naga Shader Translator label Oct 25, 2023

ErichDonGubler self-assigned this Jan 22, 2024

ErichDonGubler added the kind: diagnostics Error message should be better label Jan 22, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Should Naga's error representation be the public error API? #4429

Should Naga's error representation be the public error API? #4429

ErichDonGubler commented May 2, 2023 •

edited

Loading

dannymcgee commented May 11, 2023 •

edited

Loading

jimblandy commented May 12, 2023 •

edited

Loading

jimblandy commented May 12, 2023

Should Naga's error representation be the public error API? #4429

Should Naga's error representation be the public error API? #4429

Comments

ErichDonGubler commented May 2, 2023 • edited Loading

dannymcgee commented May 11, 2023 • edited Loading

jimblandy commented May 12, 2023 • edited Loading

jimblandy commented May 12, 2023

ErichDonGubler commented May 2, 2023 •

edited

Loading

dannymcgee commented May 11, 2023 •

edited

Loading

jimblandy commented May 12, 2023 •

edited

Loading