perf: experiment with bulk insert test results #319

giovanni-guidini · 2024-03-13T09:45:41Z

These changes have 2 objectives:

1. (duh) Try to speed up writing tests and test instances to the DB.
   This is inspired by https://docs.sqlalchemy.org/en/13/faq/performance.html#i-m-inserting-400-000-rows-with-the-orm-and-it-s-really-slow
   The idea is to bulk_update and let the DB handle the concurrency and the conflicts.
2. Test using Sentry metrics + Features for potential perf improvements.
   Although I suspect the time benefit will be enough for us to prefer the bulk_insert technique, I don't know how large that benefit is (potentially). There's also the drawback of using extra memory, and I'm not sure how much extra memory that is either.
   So I want to use this opportunity to explore this setup of running perf experiments :D
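As a rough illustration of the first objective, here is a minimal sketch of "send all rows in one batch and let the DB resolve conflicts". It is not the code from this PR: it swaps in the standard-library `sqlite3` module for the SQLAlchemy setup the PR targets, and the `tests` table, its columns, and the `bulk_insert_tests` helper are hypothetical names chosen for the example.

```python
import sqlite3


def bulk_insert_tests(conn, rows):
    """Insert many (id, name) rows with one executemany call.

    Hypothetical helper: rows whose id already exists are skipped by the
    database itself (INSERT OR IGNORE), not filtered in Python first.
    """
    with conn:  # one transaction for the whole batch, committed on success
        conn.executemany(
            "INSERT OR IGNORE INTO tests (id, name) VALUES (?, ?)",
            rows,
        )


# In-memory database and a hypothetical schema, for demonstration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tests (id TEXT PRIMARY KEY, name TEXT NOT NULL)")

bulk_insert_tests(conn, [("a", "test_a"), ("b", "test_b")])
# The second batch overlaps on id "b"; the DB ignores that row.
bulk_insert_tests(conn, [("b", "test_b"), ("c", "test_c")])

count = conn.execute("SELECT COUNT(*) FROM tests").fetchone()[0]
print(count)
```

The trade-off mentioned in the description shows up here too: the whole batch has to be built in memory before it is sent, which is the extra memory cost the experiment is meant to measure.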

Legal Boilerplate

Look, I get it. The entity doing business as "Sentry" was incorporated in the State of Delaware in 2015 as Functional Software, Inc. In 2022 this entity acquired Codecov and as result Sentry is going to need some rights from me in order to utilize my contributions in this PR. So here's the deal: I retain all rights, title and interest in and to my contributions, and by keeping this boilerplate intact I confirm that Sentry can use, modify, copy, and redistribute my contributions, under Sentry's choice of terms.


sentry-io · 2024-03-13T09:45:49Z

🔍 Existing Issues For Review

Your pull request is modifying functions with the following pre-existing issues:

📄 File: tasks/test_results_processor.py

Function	Unhandled Issue
`process_individual_upload`	FileNotInStorageError: File test_results/v1/raw/2024-03-12/25D3451FB922A5B3C2F2E4A374E5B8F0/f7475b19e0a2366ab6d57730d0f0... ... `Event Count:` 1


codecov-qa · 2024-03-13T09:50:36Z

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 98.12%. Comparing base (98df06e) to head (b6ba449).

Additional details and impacted files

@@           Coverage Diff           @@
##             main     #319   +/-   ##
=======================================
  Coverage   98.12%   98.12%           
=======================================
  Files         385      385           
  Lines       31901    31959   +58     
=======================================
+ Hits        31302    31360   +58     
  Misses        599      599

Flag	Coverage Δ
integration	`98.12% <100.00%> (+<0.01%)`	⬆️
latest-uploader-overall	`98.12% <100.00%> (+<0.01%)`	⬆️
unit	`98.12% <100.00%> (+<0.01%)`	⬆️

Flags with carried forward coverage won't be shown.

Components	Coverage Δ
NonTestCode	`96.24% <100.00%> (+0.01%)`	⬆️
OutsideTasks	`97.91% <100.00%> (+<0.01%)`	⬆️

Files	Coverage Δ
rollouts/__init__.py	`100.00% <100.00%> (ø)`
tasks/test_results_processor.py	`99.40% <100.00%> (+0.15%)`	⬆️
...sks/tests/unit/test_test_results_processor_task.py	`100.00% <100.00%> (ø)`

codecov-public-qa · 2024-03-13T09:50:48Z

Codecov Report

All modified and coverable lines are covered by tests ✅

Comparison is base (98df06e) 98.12% compared to head (b6ba449) 98.12%.

@@           Coverage Diff           @@
##             main     #319   +/-   ##
=======================================
  Coverage   98.12%   98.12%           
=======================================
  Files         385      385           
  Lines       31901    31959   +58     
=======================================
+ Hits        31302    31360   +58     
  Misses        599      599

Flag	Coverage Δ
integration	`98.12% <100.00%> (+<0.01%)`	⬆️
latest-uploader-overall	`98.12% <100.00%> (+<0.01%)`	⬆️
unit	`98.12% <100.00%> (+<0.01%)`	⬆️

Flags with carried forward coverage won't be shown.

Components	Coverage Δ
NonTestCode	`96.24% <100.00%> (+0.01%)`	⬆️
OutsideTasks	`97.91% <100.00%> (+<0.01%)`	⬆️

Files	Coverage Δ
rollouts/__init__.py	`100.00% <100.00%> (ø)`
tasks/test_results_processor.py	`99.40% <100.00%> (+0.15%)`	⬆️
...sks/tests/unit/test_test_results_processor_task.py	`100.00% <100.00%> (ø)`

codecov · 2024-03-13T09:53:28Z

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 98.11%. Comparing base (98df06e) to head (b6ba449).

Additional details and impacted files

@@           Coverage Diff           @@
##             main     #319   +/-   ##
=======================================
  Coverage   98.10%   98.11%           
=======================================
  Files         416      416           
  Lines       32601    32659   +58     
=======================================
+ Hits        31984    32042   +58     
  Misses        617      617

Flag	Coverage Δ
integration	`98.12% <100.00%> (+<0.01%)`	⬆️
latest-uploader-overall	`98.12% <100.00%> (+<0.01%)`	⬆️
unit	`98.12% <100.00%> (+<0.01%)`	⬆️

Flags with carried forward coverage won't be shown.

Components	Coverage Δ
NonTestCode	`96.18% <100.00%> (+0.01%)`	⬆️
OutsideTasks	`97.91% <100.00%> (+<0.01%)`	⬆️

Files	Coverage Δ
rollouts/__init__.py	`100.00% <100.00%> (ø)`
tasks/test_results_processor.py	`99.40% <100.00%> (+0.15%)`	⬆️
...sks/tests/unit/test_test_results_processor_task.py	`100.00% <100.00%> (ø)`

Related Entrypoints
run/app.tasks.test_results.TestResultsProcessor

joseph-sentry

Changes look good. I had trouble verifying the memory usage gathering locally, but I think we can see if that problem persists in prod.

joseph-sentry · 2024-03-13T15:35:15Z

tasks/test_results_processor.py

+        # Obviously this is a very rough estimate of sizes. We are interested more
+        # in the difference between the insert approaches. SO this should be fine.
+        # And these aux memory structures take the bulk of extra memory we need
+        memory_used += getsizeof(test_data) // 1024


I tested this out locally and the memory_used I was getting was 0. I'm not sure if this is just a local thing I ran into or if there's a problem with this approach. I'm okay with trying it out to see if it works in prod, since that's just my experience running it locally.

Depending on the number of tests you tested against, the data could be small enough in size that it returns 0. After all, the value is in KB and the integer division rounds down.

Before adding the // 1024 I was getting values for those calls, so... maybe that's it 🤷
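The rounding effect discussed in this thread can be reproduced with a small sketch. It assumes (this is not confirmed by the diff) that the increment runs once per small structure; `size_kb_per_item`, `size_kb_total`, and `rows` are hypothetical names used only for illustration.

```python
from sys import getsizeof


def size_kb_per_item(items):
    """Mimic the pattern in the diff: floor each size to whole KB, then sum."""
    total_kb = 0
    for item in items:
        total_kb += getsizeof(item) // 1024  # anything under 1024 bytes adds 0
    return total_kb


def size_kb_total(items):
    """Alternative: accumulate bytes and convert to KB once at the end."""
    total_bytes = sum(getsizeof(item) for item in items)
    return total_bytes // 1024


# Hypothetical payload: 100 small dicts, each well under 1024 bytes on CPython.
rows = [{"name": f"test_{i}", "outcome": "pass"} for i in range(100)]

print(size_kb_per_item(rows))  # every term floors to 0, so the sum is 0
print(size_kb_total(rows))     # non-zero, because flooring happens only once
```

Note that `sys.getsizeof` is shallow: for a container it reports the container object itself, not the objects it references, so either variant remains a rough estimate, as the code comment in the diff already says.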

giovanni-guidini requested review from joseph-sentry and matt-codecov March 13, 2024 09:54

joseph-sentry approved these changes Mar 13, 2024


giovanni-guidini and others added 2 commits March 13, 2024 17:00

Merge branch 'main' into gio/experiment-bulk-insert-testinstances

047d5b7

features interface changed

b6ba449

giovanni-guidini force-pushed the gio/experiment-bulk-insert-testinstances branch from 878cb98 to b6ba449 Compare March 13, 2024 16:19

giovanni-guidini merged commit 7ea10af into main Mar 13, 2024
26 checks passed

giovanni-guidini deleted the gio/experiment-bulk-insert-testinstances branch March 13, 2024 16:33

This pull request was closed.
