Data layout optimization (strategy generation). Part 3: compaction strategy generation with cost/gain scores #116

Open
wants to merge 4 commits into
base: main

Conversation

teamurko
Collaborator

@teamurko teamurko commented Jun 4, 2024

Summary

This is part 3 of a new feature: strategy generation for the data layout optimization (DLO) library. This PR is co-authored with @anjagruenheid.

Adds compaction strategy generation. Each strategy carries two scores: rewrite cost, estimated as the serial rewrite time, and rewrite gain, estimated as the time saved by reducing the number of files. This PR builds on top of #109.
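
To make the two scores concrete, here is a minimal sketch of how cost and gain could be computed for a single table. It is not the code in this PR: the names `CompactionStrategy` and `build_strategy`, the target file size, the rewrite throughput, and the per-file overhead are all hypothetical parameters chosen for illustration, and the assumption that only files below the target size are rewritten is ours.

```python
import math
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class CompactionStrategy:
    """Hypothetical container for the two scores described in the summary."""
    files_reduced: int    # how many fewer files exist after compaction
    cost_seconds: float   # estimated serial rewrite time
    gain_seconds: float   # estimated time saved from having fewer files


def build_strategy(
    file_sizes_bytes: Sequence[int],
    target_file_size_bytes: int,
    rewrite_bytes_per_second: float,
    per_file_overhead_seconds: float,
) -> CompactionStrategy:
    """Score a compaction of all files smaller than the target size.

    Assumptions (not taken from the PR): only undersized files are
    rewritten, they are packed into files of the target size, and each
    file that disappears saves a fixed per-file overhead.
    """
    small = [size for size in file_sizes_bytes if size < target_file_size_bytes]
    if len(small) < 2:
        # Nothing to merge, so there is no cost and no gain.
        return CompactionStrategy(0, 0.0, 0.0)

    bytes_to_rewrite = sum(small)
    files_after = math.ceil(bytes_to_rewrite / target_file_size_bytes)
    files_reduced = max(len(small) - files_after, 0)

    # Cost: time to rewrite every selected byte on a single worker.
    cost = bytes_to_rewrite / rewrite_bytes_per_second
    # Gain: fixed overhead saved for each file that no longer exists.
    gain = files_reduced * per_file_overhead_seconds
    return CompactionStrategy(files_reduced, cost, gain)


if __name__ == "__main__":
    strategy = build_strategy(
        file_sizes_bytes=[10, 20, 30, 200],
        target_file_size_bytes=100,
        rewrite_bytes_per_second=10.0,
        per_file_overhead_seconds=0.5,
    )
    print(strategy)
```

Keeping cost and gain as separate fields, rather than collapsing them into one number, leaves the choice of how to rank or filter strategies to the caller.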

The following 3 components will be added eventually:

  1. DLO library that has primitives for generating data layout optimization strategies
  2. App that generates strategies for all tables
  3. Scheduling of the app

Changes

  • Client-facing API Changes
  • Internal API Changes
  • Bug Fixes
  • New Features
  • Performance Improvements
  • Code Style
  • Refactoring
  • Documentation
  • Tests

For all the boxes checked, please include additional details of the changes made in this pull request.

Testing Done

  • Manually tested on local Docker setup. Please include the commands run and their output.
  • Added new tests for the changes made.
  • Updated existing tests to reflect the changes made.
  • No tests added or updated. Please explain why. If unsure, please feel free to ask for help.
  • Some other form of testing like staging or soak time in production. Please explain.

For all the boxes checked, include a detailed description of the testing done for the changes made in this pull request.

Additional Information

  • Breaking Changes
  • Deprecations
  • Large PR broken into smaller PRs, and PR plan linked in the description.

For all the boxes checked, include additional details of the changes made in this pull request.

@teamurko changed the title from "Generate compaction strategy" to "Data layout optimization (strategy generation). Part 3: compaction strategy generation with cost/gain scores" on Jun 4, 2024
@teamurko teamurko marked this pull request as ready for review June 4, 2024 20:31
@sumedhsakdeo
Collaborator

Review posted on teamurko#2

2 participants