
Access pattern observation in keyspace ("pagetrace") #10275

Open
jcsp opened this issue Jan 3, 2025 · 7 comments
Assignees
Labels
a/observability Area: related to observability c/storage/pageserver Component: storage: pageserver t/incident Issue type: incident in our service

Comments


jcsp commented Jan 3, 2025

In INC-362 we saw strong signals that the client (compute) was getting something wrong with caching: we suspect it is re-requesting the same data repeatedly, but can't prove it.

To diagnose issues like this, we need the ability to get a raw dump of the keys touched by getpage requests.

Candidate implementations:

  • Something that does a tcpdump and parses it
  • Something built into the pageserver that dumps out data for a specific tenant into some local file over some time period, for later retrieval and analysis.
@jcsp jcsp changed the title Access pattern observation in keyspace Access pattern observation in keyspace ("pagetrace") Jan 3, 2025
erikgrinaker commented

In past systems, we had an API endpoint that let us temporarily enable and output debug/trace logging at runtime for specific source code files, with regex filtering. So we could e.g. enable trace logging for the getpage handler and regex-filter by tenant/shard to dump keys for 30 seconds.

Might be a simple and general solution, if our logging/tracing library supports it.


jcsp commented Jan 3, 2025

Yeah, this should evolve into something with an API for toggling tracing per tenant (we may even have an issue for that somewhere). However, because we use Grafana for logs, and that doesn't cope well with passing around big dumps, if we want a dump of something like 100K keys to then visualize somehow, we'll probably need to output those some other way (or embrace some other system for recording results that works better than Loki).


jcsp commented Jan 3, 2025

Aside: my favorite one of these was EMC Isilon, where you could subscribe to performance metrics on a particular directory in a filesystem. Good times.


erikgrinaker commented Jan 3, 2025

Yeah, these debug events would be emitted via the API endpoint response as a stream, not via the regular log sink.

@erikgrinaker erikgrinaker self-assigned this Jan 6, 2025
@erikgrinaker erikgrinaker added c/storage/pageserver Component: storage: pageserver a/observability Area: related to observability t/incident Issue type: incident in our service labels Jan 6, 2025

erikgrinaker commented Jan 6, 2025

The tracing crate does indeed allow arbitrary subscriptions to the event stream. I propose we add an API route /trace which subscribes to the event stream and writes matching events to the response body. Example parameters:

  • level: log level to emit (default DEBUG?).
  • seconds: number of seconds to dump events for (default 30).
  • regex: regular expression filter.
  • file: filter events by source code file path.
  • field[<name>]: span field filter (e.g. field[tenant_id]=foo).

Wdyt?


jcsp commented Jan 6, 2025

I'm a little anxious about using trace+regex here: the overhead could be substantial, and we'll probably be using this in situations where we already have a performance problem.

I was thinking of something designed for minimum cost, like:

  • A piece of state on Tenant/Timeline that controls whether to trace (so for a non-traced tenant the overhead is just one load and one branch).
  • Record and output a very dense binary structure for getpage requests (e.g. a stream of records consisting of a 16-byte key, an 8-byte timestamp, and a 4-byte runtime).
  • A cap on the recording buffer to bound how much memory this can eat; e.g. 32 MB is enough to record about 1 million requests with such a dense encoding.
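The dense format sketched above works out to 28 bytes per record, so a 32 MB cap holds roughly 1.2 million records. A minimal stdlib-only sketch of such an encoding and a capped buffer might look like this (names and field meanings are illustrative, not actual pageserver code):

```rust
// Hypothetical layout: 16-byte key + 8-byte timestamp + 4-byte runtime.
const RECORD_SIZE: usize = 16 + 8 + 4; // 28 bytes per record
const BUFFER_CAP: usize = 32 * 1024 * 1024; // the 32 MB cap from the proposal

struct TraceRecord {
    key: [u8; 16],     // page key (assumed opaque here)
    timestamp_us: u64, // e.g. microseconds since trace start
    runtime_us: u32,   // request runtime
}

impl TraceRecord {
    fn encode(&self) -> [u8; RECORD_SIZE] {
        let mut buf = [0u8; RECORD_SIZE];
        buf[..16].copy_from_slice(&self.key);
        buf[16..24].copy_from_slice(&self.timestamp_us.to_be_bytes());
        buf[24..28].copy_from_slice(&self.runtime_us.to_be_bytes());
        buf
    }
}

/// Bounded trace buffer: appends records until the memory cap is hit.
struct TraceBuffer {
    data: Vec<u8>,
}

impl TraceBuffer {
    fn new() -> Self {
        Self { data: Vec::new() }
    }

    /// Returns false (dropping the record) once the cap would be exceeded.
    fn push(&mut self, rec: &TraceRecord) -> bool {
        if self.data.len() + RECORD_SIZE > BUFFER_CAP {
            return false;
        }
        self.data.extend_from_slice(&rec.encode());
        true
    }
}

fn main() {
    let mut buf = TraceBuffer::new();
    let rec = TraceRecord { key: [0xAB; 16], timestamp_us: 1_000, runtime_us: 250 };
    assert!(buf.push(&rec));
    // 32 MiB / 28 bytes = 1,198,372 records, matching the ~1 million estimate.
    println!("capacity in records: {}", BUFFER_CAP / RECORD_SIZE);
}
```

Fixed-width big-endian fields keep records trivially seekable (offset = index × 28) for later offline analysis.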

erikgrinaker commented

Discussed offline. The performance risks of a generalized tracing endpoint appear too big for us to ship something to production for debugging in a matter of days. We'll do the simple, performant thing for now: add an API endpoint that registers a fixed-size channel for a timeline, and emits compact binary data to the client via HTTP.
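The key property of a fixed-size channel here is that the request path never blocks on a slow trace consumer: the handler try-sends and simply drops records when the channel is full. A stdlib-only sketch of that behavior, assuming a hypothetical 28-byte record type (this is an illustration of the backpressure semantics, not the pageserver implementation):

```rust
use std::sync::mpsc::{sync_channel, TrySendError};

fn main() {
    // Fixed-size channel registered for a timeline; bounded to 2 records
    // here just to demonstrate the drop-when-full behavior.
    let (tx, rx) = sync_channel::<[u8; 28]>(2);

    // Producer side (the getpage handler): never blocks.
    for i in 0u8..4 {
        match tx.try_send([i; 28]) {
            Ok(()) => {}
            Err(TrySendError::Full(_)) => {
                // Channel full: drop the record rather than stall the request.
            }
            Err(TrySendError::Disconnected(_)) => break, // client went away
        }
    }

    // Consumer side (the HTTP response body): drains whatever made it through.
    drop(tx);
    let received: Vec<[u8; 28]> = rx.iter().collect();
    assert_eq!(received.len(), 2); // capacity 2, so two records survived
}
```

In production the consumer would stream the raw bytes out as the HTTP response body, and dropped records would ideally be counted so the analysis knows the trace is incomplete.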
