Conversation

@BlakeOrth
Contributor

Which issue does this PR close?

N/A -- This PR is a POC meant for discussion to inform decisions related to #18146.

Rationale for this change

This PR shares the code for a set of benchmarks so that the results can be reproduced and discussed.

What changes are included in this PR?

This PR includes a set of benchmarks that exercise the pruned_partition_list method (both the original implementation on current main and the list_all implementation from the PR) to allow us to make a more informed decision about the potential path(s) forward for #18146.

This code is not intended for merge.

Please generally avert your eyes from the benchmark code, because it comes with an 🚨 🤖 AI generated code 🤖 🚨 warning. There are a number of really silly decisions the robot made, and if we actually wanted to introduce permanent benchmarks we'd likely want to pare down the cases and re-write them. At the moment I am more interested in exploring the results than nit-picking benchmark code; however, I did ensure the actual timing loops were as tight as possible so that the results are trustworthy representations of both implementations.
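For orientation, here is a minimal sketch of what I mean by a tight timing loop, assuming a criterion async harness (with the `async_tokio` feature). The function `list_files_list_all` and the benchmark name are hypothetical stand-ins, not the code in this PR:

```rust
// A minimal sketch (not the PR's actual benchmark code) of the timing loop
// shape. `list_files_list_all` is a hypothetical stand-in for the code path
// that drives pruned_partition_list and drains the resulting file stream.
use criterion::{criterion_group, criterion_main, Criterion};
use tokio::runtime::Runtime;

async fn list_files_list_all(prefix: &str) -> usize {
    // Placeholder: the real benchmark lists files under `prefix` against an
    // ObjectStore and collects the resulting partitioned files.
    let _ = prefix;
    0
}

fn bench_listing(c: &mut Criterion) {
    let rt = Runtime::new().expect("tokio runtime");
    c.bench_function("list_all/flat_1000_files", |b| {
        // Only the listing call itself sits inside the timed body, keeping
        // the measurement as tight as possible.
        b.to_async(&rt)
            .iter(|| async { list_files_list_all("table/").await });
    });
}

criterion_group!(benches, bench_listing);
criterion_main!(benches);
```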

The benchmarks themselves include both an in-memory benchmark and an S3 benchmark that uses a local MinIO instance via testcontainers. The in-memory benches are more-or-less what I started with, and at this point are mostly there because they're academically interesting. The S3 benchmarks are necessary to truly understand end-user performance for list operations, because list operations against commercial object stores are paged at 1000 results per page. To add an additional dose of realism, the results included with this PR were run with a simulated latency of 120ms applied to my localhost interface using `tc` on Linux. Each underlying partition structure benchmark is run twice for each implementation: once to collect all the results from the list operation, and again to collect the time-to-first-result (TTFR) from the file stream.
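As a hedged illustration of the TTFR measurement pattern (the names and stream contents below are made up for the example; the real benchmarks time the file stream produced by the listing code):

```rust
// Sketch of time-to-first-result (TTFR): stop the clock as soon as the
// stream yields its first element instead of draining the whole stream.
use std::time::Instant;
use futures::stream::{self, StreamExt};

#[tokio::main]
async fn main() {
    // Stand-in for the stream of files returned by the listing implementation.
    let mut files = stream::iter(vec!["part-0.parquet", "part-1.parquet"]);

    let start = Instant::now();
    let first = files.next().await; // TTFR: only wait for the first item
    let ttfr = start.elapsed();

    println!("first file: {first:?}, time to first result: {ttfr:?}");
}
```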

To better facilitate discussion of the results I have included both the "raw" criterion results as text and a formatted table of the results as a markdown doc that's a bit easier to read. The "raw" criterion results are edited to remove some of the textual noise (improvement/regression output that was not useful or accurate, warm-up text, etc.) and have had separators added to make them a bit easier to navigate and digest. I think using in-line comments on the various table entries in s3_results_formatted.md is probably the easiest way to thread the discussion around the results, but I'm happy to facilitate other options.

Are these changes tested?

They are tests.

Are there any user-facing changes?

No.

cc @alamb
I was initially planning on adding some comments with my own interpretation of the benchmark results right after I submitted this PR to start the discussion, but in some sense I don't want to "poison the well" of additional perspectives. If you'd like me to start the discussion/interpretation I'd be happy to do so; just let me know and I can add my current thoughts.

Additional Notes:

If anyone wants to try these locally using the simulated latency, you can use this command (run as root):

tc qdisc add dev lo root handle 1:0 netem delay 60msec

This adds 60ms of delay to every packet crossing the lo device, so each request/response pair incurs a 120ms round-trip latency (60ms in each direction).

Since you're unlikely to want latency on localhost forever, you can remove it with:

tc qdisc del dev lo root

I'm not sure whether this or similar functionality exists on macOS, and I don't believe there is an equivalent on Windows.

github-actions bot added the catalog (Related to the catalog crate) label on Oct 29, 2025