Batch objects in Branch.delete_objects calls #285

Merged
merged 1 commit into from
Jul 17, 2024

Conversation

nicholasjng
Collaborator

lakeFS supports at most 1000 objects per delete_objects() invocation, because that method calls the object deletion API under the hood. This limit is not without precedent: a similar one exists, for example, on AWS S3.

Hence, we batch deletions on the client side, dispatching multiple API calls whenever the list of objects to delete contains more than a thousand entries.

The batch size is currently hardcoded to 1000, since that is the limit imposed by the lakeFS server.
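As a rough illustration of the batching described above, here is a minimal sketch, not the code merged in this PR. The names `chunks`, `delete_in_batches`, `delete_fn`, and `MAX_DELETE_OBJECTS` are hypothetical; `delete_fn` stands in for whatever call actually reaches the lakeFS deletion API.

```python
from itertools import islice
from typing import Callable, Iterable, Iterator, List

# Assumed per-call cap, taken from the description above.
MAX_DELETE_OBJECTS = 1000


def chunks(items: Iterable[str], size: int) -> Iterator[List[str]]:
    """Yield successive lists of at most `size` items."""
    it = iter(items)
    # islice returns an empty slice once the iterator is exhausted,
    # which ends the loop.
    while chunk := list(islice(it, size)):
        yield chunk


def delete_in_batches(
    delete_fn: Callable[[List[str]], None],
    paths: Iterable[str],
    batch_size: int = MAX_DELETE_OBJECTS,
) -> int:
    """Call `delete_fn` once per batch and return the number of calls made."""
    calls = 0
    for batch in chunks(paths, batch_size):
        delete_fn(batch)  # hypothetical stand-in for one deletion API call
        calls += 1
    return calls
```

With 2500 paths and the default batch size, this sketch would issue three calls carrying 1000, 1000, and 500 paths respectively; an empty list results in no calls at all.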

Fixes #284.

@nicholasjng nicholasjng self-assigned this Jul 16, 2024

codecov bot commented Jul 16, 2024

Codecov Report

Attention: Patch coverage is 78.94737% with 4 lines in your changes missing coverage. Please review.

Project coverage is 93.19%. Comparing base (361dbfc) to head (fdef99d).

Files | Patch % | Lines
src/lakefs_spec/util.py | 66.66% | 2 Missing and 2 partials ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #285      +/-   ##
==========================================
- Coverage   93.99%   93.19%   -0.80%     
==========================================
  Files           5        5              
  Lines         383      397      +14     
  Branches       72       77       +5     
==========================================
+ Hits          360      370      +10     
- Misses         14       16       +2     
- Partials        9       11       +2     


@nicholasjng
Collaborator Author

I'm not really interested in covering these missing lines: one is a Python standard library call (only taken if Python is at least version 3.12), the other is a guard against the edge case of a nonpositive chunk size (which is impossible for us, but still good to have).

I'd be interested in reviewers' opinions.
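To make the two uncovered branches concrete, here is a hedged sketch of what a version-gated chunking helper could look like. It is an assumption about the general shape, not the contents of `src/lakefs_spec/util.py`; the name `batched` and the fallback implementation are illustrative. `itertools.batched` itself is part of the standard library from Python 3.12 onward.

```python
import sys
from itertools import islice
from typing import Iterable, Iterator, Tuple, TypeVar

T = TypeVar("T")

if sys.version_info >= (3, 12):
    # Only reachable on Python 3.12+, so older CI interpreters never cover it.
    from itertools import batched as _stdlib_batched
else:
    _stdlib_batched = None


def batched(iterable: Iterable[T], n: int) -> Iterator[Tuple[T, ...]]:
    """Yield tuples of at most `n` items from `iterable`."""
    if n < 1:
        # Guard for a nonpositive chunk size; callers passing a fixed
        # positive constant never trigger it.
        raise ValueError("n must be at least one")
    if _stdlib_batched is not None:
        yield from _stdlib_batched(iterable, n)
        return
    it = iter(iterable)
    while batch := tuple(islice(it, n)):
        yield batch
```

Because this helper is a generator, the `ValueError` surfaces when iteration starts rather than at call time. Each branch is only exercised under a specific interpreter version or an invalid argument, which would explain why a single-version coverage run reports them as missing or partial.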

@nicholasjng nicholasjng merged commit 41254a2 into main Jul 17, 2024
6 of 7 checks passed
@nicholasjng nicholasjng deleted the delete-chunking branch July 17, 2024 08:03