Introduce WAPM query caching to reduce wasmer run's startup delay #3983

Merged (6 commits) on Jun 16, 2023

Michael-F-Bryan
Contributor

We've discovered a ~1 second delay on startup when running something like wasmer run python/python. After some investigation with @dynamite-bud, we narrowed it down to a GraphQL query that is executed every single time we try to fetch metadata for something that looks like a WAPM package.

tracing::trace!(%body, "Sending GraphQL query");
let request = HttpRequest {
    url: self.registry_endpoint.clone(),
    method: Method::POST,
    body: Some(body.into_bytes()),
    headers: headers(),
    options: Default::default(),
};
let response = self.client.request(request).await?;
if !response.is_ok() {
    let url = &self.registry_endpoint;
    let status = response.status;
    anyhow::bail!("\"{url}\" replied with {status}");
}
let body = response.body.unwrap_or_default();
tracing::trace!(
    body=?String::from_utf8_lossy(&body),
    "Received a response from GraphQL",
);
let response: WapmWebQuery =
    serde_json::from_slice(&body).context("Unable to deserialize the response")?;

The simplest way to solve this is by caching the response from this sort of GraphQL query, marking the cached value as stale if it's older than (for example) 10 minutes or we've just run wasmer publish. It'd be nice if we could use a sqlite database for the cache, but I'm not sure how that'd go on the web.
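
As a rough illustration of the staleness rule described above, here is a minimal sketch using only the standard library. The names CacheEntry and lookup are hypothetical and are not taken from the Wasmer codebase; the 10 minute limit is passed in as max_age, and the fetch closure stands in for the real GraphQL request.

```rust
use std::time::{Duration, SystemTime};

/// Hypothetical cache entry: the raw query response plus when it was fetched.
#[derive(Debug, Clone)]
pub struct CacheEntry {
    pub body: String,
    pub fetched_at: SystemTime,
}

impl CacheEntry {
    /// An entry is stale once it is older than `max_age`.
    /// If `fetched_at` lies in the future (the clock went backwards),
    /// the entry is conservatively treated as stale.
    pub fn is_stale(&self, now: SystemTime, max_age: Duration) -> bool {
        match now.duration_since(self.fetched_at) {
            Ok(age) => age > max_age,
            Err(_) => true,
        }
    }
}

/// Return the cached body while it is fresh; otherwise call `fetch`
/// (a stand-in for the network query) and remember its result.
pub fn lookup<F>(
    slot: &mut Option<CacheEntry>,
    now: SystemTime,
    max_age: Duration,
    fetch: F,
) -> Result<String, String>
where
    F: FnOnce() -> Result<String, String>,
{
    if let Some(entry) = slot.as_ref() {
        if !entry.is_stale(now, max_age) {
            return Ok(entry.body.clone());
        }
    }
    let body = fetch()?;
    *slot = Some(CacheEntry {
        body: body.clone(),
        fetched_at: now,
    });
    Ok(body)
}
```

Passing now in explicitly, rather than calling SystemTime::now() inside, keeps the expiry logic easy to test. Invalidation after wasmer publish could be modelled by setting the slot back to None.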

Fixes #3962

Michael-F-Bryan added this to the v4.0 milestone Jun 9, 2023
Michael-F-Bryan
Contributor Author

I've added a filesystem cache in front of WapmSource. Now it'll cache the response of a GraphQL query to $WASMER_DIR/queries/ as a JSON file (e.g. looking up the available versions for python/python would save to $WASMER_DIR/queries/python/python).

With this system, a cache miss from Australia takes about 800ms to query the WAPM backend (time.busy=19.2ms time.idle=765ms) and a cache hit takes less than a millisecond (time.busy=689µs time.idle=11.4µs).
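
The on-disk layout described above could be sketched roughly as follows. This is not the code from this PR: cache_path, store and load are hypothetical helpers, the cached JSON is handled as an opaque string, and rejecting odd path segments is an extra precaution assumed here rather than something the PR states.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Map a package name such as "python/python" to its cache file under
/// `<wasmer_dir>/queries/`. Names containing empty, `.` or `..` segments,
/// or backslashes, are rejected so a lookup cannot escape the cache dir.
pub fn cache_path(wasmer_dir: &Path, package: &str) -> Option<PathBuf> {
    let mut path = wasmer_dir.join("queries");
    for segment in package.split('/') {
        let suspicious = segment.is_empty()
            || segment == "."
            || segment == ".."
            || segment.contains('\\');
        if suspicious {
            return None;
        }
        path.push(segment);
    }
    Some(path)
}

/// Save a JSON response, creating parent directories on demand.
pub fn store(path: &Path, json: &str) -> io::Result<()> {
    if let Some(parent) = path.parent() {
        fs::create_dir_all(parent)?;
    }
    fs::write(path, json)
}

/// Load a cached response; a missing file is a cache miss, not an error.
pub fn load(path: &Path) -> io::Result<Option<String>> {
    match fs::read_to_string(path) {
        Ok(json) => Ok(Some(json)),
        Err(e) if e.kind() == io::ErrorKind::NotFound => Ok(None),
        Err(e) => Err(e),
    }
}
```

A caller would try load first and only fall back to the GraphQL query (followed by store) when it returns None. A real implementation would also need the staleness check, for example by comparing the file's modification time against the maximum age.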

Michael-F-Bryan marked this pull request as ready for review June 15, 2023 13:43

theduke (Contributor) left a comment

One concern here: running wasmer publish xxx followed by wasmer run xxx will run the old version.

Should we purge the cache for a package when publish is run?

I know this kind of intermingled logic is suboptimal, but I think it's an annoying UX gotcha otherwise.
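
To make the suggestion concrete, a purge step could be as small as the sketch below. purge_cached_query is a hypothetical helper, not an existing Wasmer function, and it assumes the cache is one file per package as described earlier in this thread (e.g. $WASMER_DIR/queries/python/python).

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Remove the cached query file for one package.
/// Returns Ok(true) if a file was deleted and Ok(false) if nothing was
/// cached, so publishing a never-run package is not treated as an error.
pub fn purge_cached_query(cache_file: &Path) -> io::Result<bool> {
    match fs::remove_file(cache_file) {
        Ok(()) => Ok(true),
        Err(e) if e.kind() == io::ErrorKind::NotFound => Ok(false),
        Err(e) => Err(e),
    }
}
```

The publish command would call this after a successful upload, so the next wasmer run misses the cache and sees the new version.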

Review thread on lib/wasix/src/runtime/resolver/wapm_source.rs (outdated, resolved)
Michael-F-Bryan merged commit 8179d2e into master Jun 16, 2023
Michael-F-Bryan deleted the startup-delay branch June 16, 2023 14:54
Development

Successfully merging this pull request may close these issues.

1 second delay when starting wasmer run
3 participants