-
-
Notifications
You must be signed in to change notification settings - Fork 651
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
plugins: Improve error handling on plugin version mismatch #1838
plugins: Improve error handling on plugin version mismatch #1838
Conversation
and do not crash the application on failure to receive a render update from plugins any longer. Instead, will print a simple string with a hint to check the application logs, where a more thorough error indication can be found.
to create ad-hoc errors with custom error types, without having to wrap them into a `context()` before to turn the into anyhow errors.
and terminate execution with a descriptive error message in case the plugin version is incompatible with the version of zellij being run.
so the user knows which plugin to look at in case they're using custom plugins.
Previously we would accept cases where the plugin version was newer than the zellij version, which doesn't make a lot of sense.
in call to `wasmer::Function::call` in case a plugin version mismatch can occur.
that will print the panic message to a plugins stdout and then call a panic handler on the host that turns it into a real application-level panic.
and turn them into proper panics. These errors are symptomatic of an uncaught plugin version mismatch, for example when developing from main and compiling zellij/the plugins from source. Normal users should never get to see this error.
for anyhow errors. The default anyhow error formatting of `{:?}` is already very good, and we just made it worse by trying to invent our own formatting.
TODO
|
Improve error handling on plugin version mismatch.
@imsnif Are you fine with the new error formats posted above? You needn't read all the text, but you can skim the text in the code fences. |
@har7an - skimming over the code blocks, this looks great. Minor nitpick - I think it looks better if the error voice is passive "the plugins do not match the current Zellij version". Otherwise - pretty awesome. Will this happen for any sort of deserialization error? |
It will happen for all unhandled panics in the plugins. The "version mismatch" message in particular will be shown for all deserialization failures in the call to |
This PR picks up a long-standing issue with respect to the plugin system. Since we do not make guarantees about plugin version compatibility, we always require the plugin versions to match the zellij version.
Current situation
This is usually not an issue for people that only install zellij from official releases, because in this case zellij takes care of updating the plugins proper itself. Things start to get hairy when developing zellij, because once a user has built the plugins themselves from
main
, they will not be updated on the next release automatically.Previously we had no mechanism to check for this: The plugins were just loaded and executed. Errors always cropped up in the plugins
update
functions, where anEvent
is deserialized from stdin on the plugin side (i.e. in the WASM sandbox/VM). At this point, the serde error isunwrap
ped, and the plugin crashes along with the application.Symptomatic for this sort of failure are multiple things:
unwrap
inside the plugin is masked byWASM
, which is evident from the logs where messages like this would occur (Note the log level "DEBUG"):RuntimeError
without any further information.2 above is particularly annoying because it's the only hint the user has to go on and it starts with
Trap(UnreachableCodeReached)
. Not helpful.Improving the situation
This PR approaches the problem in a number of ways.
Don't mask the underlying error cause
#1827 laid the groundwork for this PR, in that it first dealt with the issue of version mismatches and reporting them. This was achieved by appending a descriptive error message before unwrapping on a failed call to
update
a plugin. Unfortunately, in the stdout that was shown to the user, this error was frequently covered by a related crash happening in thescreen
thread when trying torender
the offending plugins pane. What happens there is that we dispatch a request to the plugin to render its output, and expect to receive a message here. However, the plugin may already have crashed at this point, closing thempsc::Channel
we're trying to receive from.What we now do instead is:
RecvError
,This keeps the
screen
thread from crashing and allows us to display the real error cause on stdout.Check zellij/plugin version compatibility
Furthermore, in order to prevent such situations from happening in the first place, we're now embedding a plugin version into the plugins at build time. As of currently, this is the zellij version (taken from
zellij_utils::consts::VERSION
), taken from the version in theCargo.toml
.Since we're not making guarantees about backwards compatibility yet, the requirement is that Plugin version == Zellij version. We check the versions when starting the plugin (after loading it, but before updating it for the first time). If this check fails, we output an error such as this:
In this case, I tried to run
zellij
compiled from main against plugins compiled on v0.31.4. Since the old plugins don't have any version information, we show the plugin version as "Unavailable".Catching mismatches during development
The previous method of catching errors works wonderfully for users that just update their application from release to release. However, developers compiling zellij off of main will not update the zellij application version after every compilation. This is particularly tricky, as one may spend multiple hours developing on the latest main before performing commits, so a comparison such as "zellij git-rev" == "plugin git-rev" won't do the trick either. However, every modification increases the chance of a change in the
Event
serialization structure, making the plugins incompatible.As mentioned above, the typical symptom tor this error is that deserialization of the
Event
on the plugin fails. At this point it would have been amazing if wasmer could propagate errors from an exported WASM function called inside the plugin onto the host. The WASM docs do not talk about this at all, and it isn't considered an error when one does change the called function inside the WASM module (plugin) to return aResult
. As far as I could gather, however, thisResult
is silently dropped.It is stated somewhere on the wasmer GitHub that a
panic
inside a WASM module will be converted to aRuntimeError
, and there is even aRuntimeError::downcast
, but that also doesn't give any hints about the underlying error cause.Instead, we're overriding the panic hook to send a message (as string) to the host. Overriding the panic hook is okay in this case, because the default wasmer panic hook in our case doesn't contribute any meaningful information (likely due to
wasm-opt
used during building). Also, sending the panic message as string is the only sensible way to go: Since a panic is likely to occur upon deserialization, we obviously cannot expect to match the hosts data types any more. Hence, serializing to some other type and deserializing on the host successfully likely isn't possible at all.Here's the resulting error for this error case:
The error in this case was provoked by introducing a new key "Foo", that the plugins didn't know about. As an added bonus compared to the previous situation, we now also get to see why exactly the deserialization failed in the output of stdout (no more hiding as "DEBUG" message in the logs).