Metric recording built documentation size #2238

Closed
Nemo157 opened this issue Oct 2, 2023 · 5 comments · Fixed by #2644
Labels
A-builds Area: Building the documentation for a crate E-medium Effort: This requires a fair amount of work

Comments

@Nemo157
Member

Nemo157 commented Oct 2, 2023

While discussing rust-lang/rust#115718 I had a thought that being able to see historical "typical" documentation sizes could be useful, and docs.rs might be a good place to gather that.

It'll likely be a pretty varying metric depending on what crates are being built each day, maybe too variable to be useful, but I think it should be pretty easy to add and then we could see whether it seems to give useful data.

@camelid
Member

camelid commented Oct 2, 2023

It'd be important to record the median in particular since it's less susceptible to outliers. Percentiles could be useful too.
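To illustrate why the median holds up better than the mean here, below is a minimal sketch using the nearest-rank percentile method. The function names and the sample sizes are made up for illustration; this is not docs.rs code.

```rust
/// Nearest-rank percentile of a set of sizes (any unit, e.g. MB).
/// Returns None for an empty slice or for p > 100.
fn percentile(sizes: &[u64], p: u32) -> Option<u64> {
    if sizes.is_empty() || p > 100 {
        return None;
    }
    let mut sorted = sizes.to_vec();
    sorted.sort_unstable();
    // Nearest-rank: rank = ceil(p / 100 * n), clamped to at least 1.
    let n = sorted.len();
    let rank = (p as usize * n + 99) / 100;
    Some(sorted[rank.max(1) - 1])
}

/// Arithmetic mean (integer division), for comparison with the median.
fn mean(sizes: &[u64]) -> Option<u64> {
    if sizes.is_empty() {
        return None;
    }
    Some(sizes.iter().sum::<u64>() / sizes.len() as u64)
}
```

With hypothetical sizes of `[10, 12, 11, 13, 500]` MB, `percentile(&sizes, 50)` gives 12 while `mean(&sizes)` gives 109, so a single outlier moves the mean a lot and leaves the median alone.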

@Nemo157
Member Author

Nemo157 commented Oct 2, 2023

Yeah, I was thinking of recording it as a histogram. We've been recording build durations that way for a while now, which gives some idea of the volatility from the different sorts of crates that get published each day (25th, 50th, 75th, 90th, 95th, 99th percentile lines):

[image: chart of build durations with percentile lines]
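As a rough sketch of what recording sizes as a histogram could look like, here is a stdlib-only cumulative-bucket histogram, loosely in the style Prometheus uses (each bucket counts observations less than or equal to its upper bound). The type name, method names, and bucket bounds are assumptions for illustration, and the quantile lookup is a simplification that returns a bucket bound rather than interpolating.

```rust
/// Cumulative histogram of documentation sizes in bytes.
struct SizeHistogram {
    upper_bounds: Vec<u64>, // sorted ascending
    counts: Vec<u64>,       // counts[i] = observations <= upper_bounds[i]
    total: u64,             // all observations, including ones above the last bound
}

impl SizeHistogram {
    fn new(mut upper_bounds: Vec<u64>) -> Self {
        upper_bounds.sort_unstable();
        upper_bounds.dedup();
        let counts = vec![0; upper_bounds.len()];
        SizeHistogram { upper_bounds, counts, total: 0 }
    }

    /// Records one observation in every bucket whose bound covers it.
    fn observe(&mut self, bytes: u64) {
        self.total += 1;
        for (bound, count) in self.upper_bounds.iter().zip(self.counts.iter_mut()) {
            if bytes <= *bound {
                *count += 1;
            }
        }
    }

    /// Upper bound of the first bucket that covers quantile `q` (0.0..=1.0).
    /// Returns None if the histogram is empty, `q` is out of range, or the
    /// quantile lies above the largest bound.
    fn quantile_upper_bound(&self, q: f64) -> Option<u64> {
        if self.total == 0 || !(0.0..=1.0).contains(&q) {
            return None;
        }
        let needed = ((q * self.total as f64).ceil() as u64).max(1);
        self.upper_bounds
            .iter()
            .zip(self.counts.iter())
            .find(|(_, count)| **count >= needed)
            .map(|(bound, _)| *bound)
    }
}
```

The trade-off is the usual one for histograms: only bucket counts are kept, so percentiles are only as precise as the chosen bounds.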

@syphar
Member

syphar commented Oct 3, 2023

What's the use-case for this? It sounds useful.

Would "documentation size" be the number of files? Megabytes? Across all targets?

Depending on how exact / how often / how far back in the history we need the numbers, we could also store it in the database.

@camelid
Member

camelid commented Oct 3, 2023

Rustdoc recently had a regression (rust-lang/rust#115718) where the docs storage size of certain crates ballooned (from 27 MB to around 500 MB). So for the documentation size, perhaps it'd be the documentation size across all targets, divided by the number of targets for that crate? Storing the data in the database would probably be useful.
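The "total size divided by the number of targets" idea could be sketched like this. The function name and the input shape (pairs of target triple and size in bytes) are assumptions for illustration only.

```rust
/// Average documentation size per target, in bytes (integer division).
/// `per_target` holds (target triple, size in bytes) pairs.
/// Returns None when no targets are given.
fn average_size_per_target(per_target: &[(&str, u64)]) -> Option<u64> {
    if per_target.is_empty() {
        return None;
    }
    let total: u64 = per_target.iter().map(|(_, bytes)| *bytes).sum();
    Some(total / per_target.len() as u64)
}
```

For example, two targets at 27,000,000 and 29,000,000 bytes average out to 28,000,000 bytes, which keeps crates that build many targets comparable with crates that build one.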

@syphar added the labels A-builds, E-easy, and E-medium, and removed E-easy, on Feb 14, 2024
@syphar
Member

syphar commented Feb 14, 2024

So, some thoughts here (aka, partial mentoring instructions).

  • when running the build, we have all the documentation for all targets in one folder. It might be possible to figure out target-specific doc sizes from that, though it's tricky for the default target which doesn't have a subfolder.
  • we have crates with fairly large documentation (from what I remember stm32ral sums up to around 4 GiB compressed, with several million files), so we should perhaps at least benchmark the added build time to calculate sizes for the whole crate, or per target.
  • depending on the use-case I could imagine that just tracking size for all targets together is good enough. But also, when we start tracking some data it might also be useful to directly start per-target. Or the average per target?

Generally, we could store this in the database per release, and/or report it to Prometheus so we can add it to our dashboards.
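A minimal sketch of the size calculation described above, using only the standard library. Everything here is hypothetical (names, signatures, and the folder layout beyond what is stated above), and the default-target figure is derived by subtraction, which assumes that every file outside a target subfolder belongs to the default target. That may overcount if shared files live at the top level.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Total size in bytes and number of regular files below a directory.
#[derive(Debug, Default, Clone, Copy, PartialEq, Eq)]
struct DirStats {
    bytes: u64,
    files: u64,
}

/// Walks `dir` and sums up all regular files.
fn dir_stats(dir: &Path) -> io::Result<DirStats> {
    let mut stats = DirStats::default();
    // Explicit stack instead of recursion, so deep trees cannot overflow the call stack.
    let mut pending = vec![dir.to_path_buf()];
    while let Some(current) = pending.pop() {
        for entry in fs::read_dir(&current)? {
            let entry = entry?;
            // symlink_metadata does not follow symlinks, so links are
            // neither traversed nor counted.
            let meta = fs::symlink_metadata(entry.path())?;
            if meta.is_dir() {
                pending.push(entry.path());
            } else if meta.is_file() {
                stats.bytes += meta.len();
                stats.files += 1;
            }
        }
    }
    Ok(stats)
}

/// Returns (stats attributed to the default target, stats per other target).
/// `other_targets` are the subfolder names of the non-default targets.
fn per_target_stats(
    doc_root: &Path,
    other_targets: &[&str],
) -> io::Result<(DirStats, Vec<(String, DirStats)>)> {
    let total = dir_stats(doc_root)?;
    let mut rest = DirStats::default();
    let mut per_target = Vec::new();
    for target in other_targets {
        let stats = dir_stats(&doc_root.join(target))?;
        rest.bytes += stats.bytes;
        rest.files += stats.files;
        per_target.push((target.to_string(), stats));
    }
    // Assumption: whatever is not inside a target subfolder counts as default target.
    let default = DirStats {
        bytes: total.bytes - rest.bytes,
        files: total.files - rest.files,
    };
    Ok((default, per_target))
}
```

This walks the target subfolders twice (once as part of the total, once on their own), so for crates with millions of files it would be worth benchmarking, or folding both passes into one.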
