Report generation is slow and prone to failure #564

Open
pietroalbini opened this issue Apr 6, 2021 · 4 comments

Comments

@pietroalbini
Member

Generating Crater reports is now really slow compared to a year or two ago, as Crater is handling an ever-increasing number of crates it needs to test. Lately the Crater server has also started crashing when generating the reports for some runs. There are two main causes I see for this:

  • When generating the report, tarballs of all the logs are created to aid processing the results locally. Each tarball is kept fully in memory before being persisted to S3 though, so with a high enough number of logs the server can OOM.
  • Uploading the logs is really slow, as Crater currently uploads one file to S3 at a time. With hundreds of thousands of logs to upload this quickly becomes a problem.
@pietroalbini
Member Author

While the proper solution would be to fix both issues (by using the filesystem as the temporary storage for archives while they're created and by uploading logs to S3 in parallel), I think there is a quicker approach that could postpone both problems.
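To make the parallel-upload half of that concrete, here is a minimal sketch using only the standard library. It is not Crater's actual code: `upload_in_parallel` and the `upload` callback are hypothetical names, and the callback stands in for whatever S3 client call is used. A fixed number of worker threads pull indices from a shared counter, so at most `workers` uploads are in flight at once.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Mutex;
use std::thread;

/// Runs `upload` over every item in `logs` using up to `workers` threads.
/// Returns the (index, error) pairs of the uploads that failed, sorted by index.
/// `upload` is a placeholder for the real S3 call (an assumption of this sketch).
fn upload_in_parallel<T, E, F>(logs: &[T], workers: usize, upload: F) -> Vec<(usize, E)>
where
    T: Sync,
    E: Send,
    F: Fn(&T) -> Result<(), E> + Sync,
{
    // Shared cursor: each worker claims the next index that nobody has taken yet.
    let next = AtomicUsize::new(0);
    let failures = Mutex::new(Vec::new());

    // Scoped threads may borrow `logs`, `next`, `failures` and `upload` directly.
    thread::scope(|scope| {
        for _ in 0..workers.max(1) {
            scope.spawn(|| loop {
                let i = next.fetch_add(1, Ordering::Relaxed);
                // Stop once every index has been claimed.
                let Some(log) = logs.get(i) else { break };
                if let Err(err) = upload(log) {
                    failures.lock().unwrap().push((i, err));
                }
            });
        }
    });

    let mut failures = failures.into_inner().unwrap();
    failures.sort_by_key(|(i, _)| *i);
    failures
}
```

Collecting failures instead of aborting on the first one would let a report retry only the logs that did not make it, though whether that fits Crater's error handling is a separate question.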

Right now we're handling and uploading the logs for all the crates, even the ones that were not regressions. Because of that, most of the logs we upload are actually useless (like the logs for test-pass crates). If we were to change the report process to just avoid processing uninteresting logs we would save storage space and make the problems go away for a long time.
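A rough sketch of that filtering step, with made-up types: the `Comparison` variants, `CrateLog`, and `logs_to_upload` below are illustrative assumptions and do not mirror Crater's real data model. Which categories count as interesting is a policy choice; this version keeps regressions only.

```rust
/// Hypothetical outcome of comparing a crate's results across the two toolchains.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Comparison {
    Regressed,
    Fixed,
    SameTestPass,
    SameBuildFail,
}

/// Hypothetical record tying a crate to its comparison result.
struct CrateLog {
    krate: String,
    comparison: Comparison,
}

/// Policy for this sketch: only regressions are worth uploading.
fn is_interesting(comparison: Comparison) -> bool {
    matches!(comparison, Comparison::Regressed)
}

/// Returns references to the logs that should be processed and uploaded.
fn logs_to_upload(logs: &[CrateLog]) -> Vec<&CrateLog> {
    logs.iter()
        .filter(|log| is_interesting(log.comparison))
        .collect()
}
```

Since most crates in a run land in the test-pass bucket, a filter like this would shrink the upload set considerably without touching the upload code itself.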

@Mark-Simulacrum
Member

I think we should move the tarballs to disk, but I at least find it sometimes helpful to look for past successful crate builds. I wouldn't stop uploading them personally; I think the reliability issues here should be solved by moving to disk storage for the tarballs.
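For illustration, moving to disk could look roughly like the sketch below, which uses only the standard library. The function names are hypothetical, and the length-prefixed framing is a stand-in for a real tar writer plus compression. The point is that only one log and one fixed-size buffer are in memory at any time.

```rust
use std::fs::File;
use std::io::{self, BufWriter, Read, Write};
use std::path::Path;

/// Appends each (name, body) log to an archive file on disk, one log at a time.
/// The framing (name length, name, body length, body) is a placeholder for tar.
/// Returns the total number of bytes written.
fn spool_logs_to_disk<'a, I>(path: &Path, logs: I) -> io::Result<u64>
where
    I: IntoIterator<Item = (&'a str, &'a [u8])>,
{
    let mut out = BufWriter::new(File::create(path)?);
    let mut written = 0u64;
    for (name, body) in logs {
        out.write_all(&(name.len() as u64).to_le_bytes())?;
        out.write_all(name.as_bytes())?;
        out.write_all(&(body.len() as u64).to_le_bytes())?;
        out.write_all(body)?;
        // Two 8-byte length prefixes plus the name and body themselves.
        written += 16 + name.len() as u64 + body.len() as u64;
    }
    out.flush()?;
    Ok(written)
}

/// Reads the archive back in chunks of at most `chunk_size` bytes and hands
/// each chunk to `send`, which stands in for one part of a streamed upload.
fn stream_in_chunks<F>(path: &Path, chunk_size: usize, mut send: F) -> io::Result<()>
where
    F: FnMut(&[u8]) -> io::Result<()>,
{
    let mut file = File::open(path)?;
    let mut buf = vec![0u8; chunk_size.max(1)];
    loop {
        let n = file.read(&mut buf)?;
        if n == 0 {
            return Ok(()); // End of file: everything has been sent.
        }
        send(&buf[..n])?;
    }
}
```

With this shape, peak memory depends on the chunk size rather than on the number of logs in the run.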

@pietroalbini
Member Author

Honestly I thought test-pass + test-pass results were never looked at. If someone actually uses them it's fine to keep them!

@Mark-Simulacrum
Member

#659 makes a start on this by uploading only the regressed crates' logs as raw files (vs. compressed tarballs). This should drastically speed things up for most Crater runs.
