Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement ArtifactBundle flat file indexing #53505

Merged
merged 6 commits into from
Jul 25, 2023

Conversation

Swatinem
Copy link
Member

@Swatinem Swatinem commented Jul 25, 2023

Supercedes #52785

The high-level overview is this:

  • When uploading a new ArtifactBundle, we get_or_create the ArtifactBundleFlatFileIndex, and update_or_create the FlatFileIndexState, setting it to NOT_INDEXED.
  • Then, using a distributed lock, we read the existing ArtifactBundleFlatFileIndex, merging the ArtifactBundle into it and updating it, finally setting the FlatFileIndexState to WAS_INDEXED.

As the final step is behind a distributed lock, we can be confident that we don’t have data races.

If any error happens during this operation, the FlatFileIndexState indicates that indexing still needs to happen.
A separate "repair-job" could in the future just query all the FlatFileIndexState entries older than X that were not yet indexed, and add them to that index at a later time.

@Swatinem Swatinem requested a review from a team July 25, 2023 09:22
@Swatinem Swatinem self-assigned this Jul 25, 2023
@github-actions github-actions bot added the Scope: Backend Automatically applied to PRs that change backend components label Jul 25, 2023
@Swatinem Swatinem force-pushed the swatinem/update-bundle-index branch from e8e0535 to 40447eb Compare July 25, 2023 09:25
@iambriccardo iambriccardo self-requested a review July 25, 2023 09:45
@codecov
Copy link

codecov bot commented Jul 25, 2023

Codecov Report

Merging #53505 (f22187c) into master (37d4a80) will increase coverage by 0.03%.
The diff coverage is 92.63%.

Additional details and impacted files
@@            Coverage Diff             @@
##           master   #53505      +/-   ##
==========================================
+ Coverage   79.52%   79.55%   +0.03%     
==========================================
  Files        4941     4942       +1     
  Lines      208882   209073     +191     
  Branches    35574    35603      +29     
==========================================
+ Hits       166119   166338     +219     
+ Misses      37696    37656      -40     
- Partials     5067     5079      +12     
Files Changed Coverage Δ
src/sentry/conf/server.py 91.45% <ø> (ø)
src/sentry/tasks/assemble.py 86.24% <73.68%> (-0.85%) ⬇️
src/sentry/models/artifactbundle.py 97.81% <87.50%> (+1.41%) ⬆️
src/sentry/debug_files/artifact_bundle_indexing.py 95.55% <95.55%> (ø)
src/sentry/features/__init__.py 100.00% <100.00%> (ø)

... and 9 files with indirect coverage changes

src/sentry/debug_files/artifact_bundle_indexing.py Outdated Show resolved Hide resolved
src/sentry/debug_files/artifact_bundle_indexing.py Outdated Show resolved Hide resolved
src/sentry/debug_files/artifact_bundle_indexing.py Outdated Show resolved Hide resolved
src/sentry/debug_files/artifact_bundle_indexing.py Outdated Show resolved Hide resolved
src/sentry/tasks/assemble.py Outdated Show resolved Hide resolved
Co-authored-by: Riccardo Busetti <[email protected]>
Copy link
Member

@iambriccardo iambriccardo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@Swatinem Swatinem merged commit 64fba06 into master Jul 25, 2023
55 of 56 checks passed
@Swatinem Swatinem deleted the swatinem/update-bundle-index branch July 25, 2023 13:27
chloeho7 pushed a commit that referenced this pull request Jul 25, 2023
The high-level overview is this:

- When uploading a new `ArtifactBundle`, we `get_or_create` the
`ArtifactBundleFlatFileIndex`, and `update_or_create` the
`FlatFileIndexState`, setting it to `NOT_INDEXED`.
- Then, using a distributed lock, we read the existing
`ArtifactBundleFlatFileIndex`, merging the `ArtifactBundle` into it and
updating it, finally setting the `FlatFileIndexState` to `WAS_INDEXED`.

As the final step is behind a distributed lock, we can be confident that
we don’t have data races.

If any error happens during this operation, the `FlatFileIndexState`
indicates that indexing still needs to happen.
A separate "repair-job" could in the future just query all the
`FlatFileIndexState` entries older than X that were not yet indexed, and
add them to that index at a later time.

---------

Co-authored-by: Riccardo Busetti <[email protected]>
@sentry-io
Copy link

sentry-io bot commented Jul 27, 2023

Suspect Issues

This pull request has been deployed and Sentry has observed the following issues:

  • ‼️ ArtifactBundleFlatFileIndex.MultipleObjectsReturned: get() returned more than one ArtifactBundleFlatFileIndex -- it returned 2! sentry.tasks.assemble.assemble_artifacts View Issue

Did you find this useful? React with a 👍 or 👎

@github-actions github-actions bot locked and limited conversation to collaborators Aug 11, 2023
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
Scope: Backend Automatically applied to PRs that change backend components
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants