feat(storage/dataflux): add worksteal algorithm to fast-listing #10913

Merged
merged 12 commits into from
Sep 29, 2024

@akansha1812 akansha1812 commented Sep 25, 2024

feat: add worksteal algorithm to fast-listing
Dataflux fast-listing lists the objects in a bucket quickly and in parallel by using the worksteal algorithm.

The worksteal algorithm splits a given namespace into multiple ranges so that multiple workers (goroutines) can list the objects in a GCS bucket in parallel.

Fixes #10731
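
For readers unfamiliar with the idea, here is a minimal sketch of range-based work stealing. It is not the code in this PR: it assumes an in-memory, sorted slice of names in place of a real bucket, and `span`, `pool`, `next`, and `listParallel` are hypothetical names chosen for illustration. Worker 0 starts with the whole range; an idle worker takes the upper half of the largest range still held by another worker.

```go
package main

import (
	"sort"
	"sync"
)

// span is a half-open index range [lo, hi) into a sorted name list.
// It stands in for a lexicographic name range in a real bucket (assumption).
type span struct{ lo, hi int }

// pool holds the unlisted span of every worker behind one mutex.
type pool struct {
	mu    sync.Mutex
	spans []span
}

// next hands worker id its next page of at most page items.
// If the worker's own span is empty, it first steals the upper half of the
// largest span that still has at least two items. It reports false when the
// worker has nothing left and nothing can be stolen.
func (p *pool) next(id, page int) (span, bool) {
	p.mu.Lock()
	defer p.mu.Unlock()
	if p.spans[id].lo >= p.spans[id].hi {
		victim, best := -1, 1
		for i, s := range p.spans {
			if n := s.hi - s.lo; i != id && n > best {
				victim, best = i, n
			}
		}
		if victim < 0 {
			return span{}, false
		}
		v := &p.spans[victim]
		mid := v.lo + (v.hi-v.lo)/2
		p.spans[id] = span{mid, v.hi} // thief takes the upper half
		v.hi = mid                    // victim keeps the lower half
	}
	s := &p.spans[id]
	end := s.lo + page
	if end > s.hi {
		end = s.hi
	}
	out := span{s.lo, end}
	s.lo = end // claimed items leave the span, so they cannot be stolen
	return out, true
}

// listParallel "lists" names with the given number of goroutines and
// returns the collected names in sorted order.
func listParallel(names []string, workers, page int) []string {
	if workers < 1 {
		workers = 1
	}
	if page < 1 {
		page = 1
	}
	p := &pool{spans: make([]span, workers)}
	p.spans[0] = span{0, len(names)} // worker 0 starts with the whole namespace
	var (
		wg  sync.WaitGroup
		mu  sync.Mutex
		out []string
	)
	for id := 0; id < workers; id++ {
		wg.Add(1)
		go func(id int) {
			defer wg.Done()
			for {
				s, ok := p.next(id, page)
				if !ok {
					return
				}
				mu.Lock()
				out = append(out, names[s.lo:s.hi]...)
				mu.Unlock()
			}
		}(id)
	}
	wg.Wait()
	sort.Strings(out)
	return out
}
```

Calling `listParallel(names, 4, 7)` on a sorted slice should return every name exactly once, however the ranges end up divided among the goroutines. A real implementation would split on object-name strings and issue paginated list requests instead of slicing an array.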

@akansha1812 akansha1812 requested review from a team as code owners September 25, 2024 00:10

conventional-commit-lint-gcf bot commented Sep 25, 2024

🤖 I detect that the PR title and the commit message differ and there's only one commit. To use the PR title for the commit history, you can use Github's automerge feature with squashing, or use automerge label. Good luck human!

-- conventional-commit-lint bot
https://conventionalcommits.org/

@product-auto-label product-auto-label bot added the api: storage Issues related to the Cloud Storage API. label Sep 25, 2024
@akansha1812 akansha1812 changed the title feat(storage/dataflux): adding worksteal algorithm for listing feat(storage/dataflux): add worksteal algorithm to fast-listing Sep 25, 2024

@BrennaEpp BrennaEpp left a comment

Some initial thoughts. This should also be tested for race conditions, but that can be done with integration tests in a follow-up.

Resolved review threads, by file:
storage/dataflux/fast_list.go (5)
storage/dataflux/worksteal.go (9)
storage/dataflux/fast_list_test.go (1)
internal/testutil/context.go (1)

@BrennaEpp BrennaEpp left a comment

Thank you!

@BrennaEpp BrennaEpp merged commit 015b52c into googleapis:main Sep 29, 2024
8 checks passed
Labels
api: storage Issues related to the Cloud Storage API.
Linked issues
storage: implement dataflux fast listing
2 participants