Data layout optimization (strategy generation). Part 2: data source for statistics/query logs #109

Merged: 2 commits into linkedin:main on Jun 20, 2024

Conversation

teamurko
Collaborator

@teamurko teamurko commented May 27, 2024

Summary

This is part 2 of a new feature: the data layout optimization (DLO) library for strategy generation.
It adds a data source interface and implementation for statistics and query logs. This PR builds on top of #108.
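To illustrate what a data source interface with a pluggable implementation can look like, here is a minimal Python sketch. It is not the code from this PR: `DataSource`, `FileStat`, `InMemoryFileStatsSource`, and `total_size_bytes` are hypothetical names chosen for the example, and the in-memory list stands in for whatever table metadata or query logs the real implementation reads.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass
from typing import Generic, Iterable, List, TypeVar

T = TypeVar("T")


class DataSource(ABC, Generic[T]):
    """Hypothetical interface: a source of records used for strategy generation."""

    @abstractmethod
    def get(self) -> Iterable[T]:
        """Return the records this source provides."""


@dataclass(frozen=True)
class FileStat:
    """Hypothetical per-file statistic: file path and size in bytes."""

    path: str
    size_bytes: int


class InMemoryFileStatsSource(DataSource[FileStat]):
    """Illustrative implementation backed by a list instead of table metadata."""

    def __init__(self, stats: List[FileStat]) -> None:
        self._stats = list(stats)

    def get(self) -> Iterable[FileStat]:
        return iter(self._stats)


def total_size_bytes(source: DataSource[FileStat]) -> int:
    """Consumers depend only on the interface, not on where the data lives."""
    return sum(stat.size_bytes for stat in source.get())
```

Keeping consumers such as `total_size_bytes` dependent on the interface alone means a statistics-backed or query-log-backed source could be swapped in without changing the strategy code.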

The following 3 components will be added eventually:

  1. DLO library that has primitives for generating data layout optimization strategies
  2. App that generates strategies for all tables
  3. Scheduling of the app

Changes

  • Client-facing API Changes
  • Internal API Changes
  • Bug Fixes
  • New Features
  • Performance Improvements
  • Code Style
  • Refactoring
  • Documentation
  • Tests

For all the boxes checked, please include additional details of the changes made in this pull request.

Testing Done

  • Manually tested on local Docker setup. Please include the commands run and their output.
  • Added new tests for the changes made.
  • Updated existing tests to reflect the changes made.
  • No tests added or updated. Please explain why. If unsure, please feel free to ask for help.
  • Some other form of testing like staging or soak time in production. Please explain.

For all the boxes checked, include a detailed description of the testing done for the changes made in this pull request.

Additional Information

  • Breaking Changes
  • Deprecations
  • Large PR broken into smaller PRs, and PR plan linked in the description.

For all the boxes checked, include additional details of the changes made in this pull request.

Collaborator

@sumedhsakdeo sumedhsakdeo left a comment

Can we create this PR on top of part1 as the base? Thanks.

sumedhsakdeo
sumedhsakdeo previously approved these changes Jun 18, 2024
Collaborator

@sumedhsakdeo sumedhsakdeo left a comment

lgtm, clean PR easy to follow. onto PR3 now.

@teamurko teamurko merged commit 397e483 into linkedin:main Jun 20, 2024
1 check passed
@teamurko teamurko mentioned this pull request Jun 21, 2024
17 tasks
teamurko added a commit that referenced this pull request Aug 2, 2024
…rategy generation with cost/gain scores (#116)

## Summary
This is part 3 of a new feature: data layout optimization library,
strategy generation. This PR is co-authored with @anjagruenheid.

Added compaction strategy generation with rewrite cost as serial rewrite
time and rewrite gain as time-saving from number of files reduced. This
PR builds on top of #109
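
As a rough illustration of the cost/gain idea described above, the following Python sketch estimates rewrite cost as the time to rewrite all bytes serially and gain as the per-file overhead saved by reducing the file count. The formulas and every name here (`estimate_compaction`, `target_file_size_bytes`, `rewrite_bytes_per_sec`, `per_file_open_cost_sec`) are assumptions made for the example, not the scoring model implemented in the PR.

```python
import math
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class CompactionEstimate:
    cost_sec: float      # assumed: time to rewrite all bytes serially
    gain_sec: float      # assumed: per-file overhead saved after compaction
    files_reduced: int   # files before minus estimated files after


def estimate_compaction(
    file_sizes_bytes: List[int],
    target_file_size_bytes: int,
    rewrite_bytes_per_sec: float,
    per_file_open_cost_sec: float,
) -> CompactionEstimate:
    """Estimate compaction cost and gain under the assumptions stated above."""
    if target_file_size_bytes <= 0 or rewrite_bytes_per_sec <= 0:
        raise ValueError("target file size and rewrite rate must be positive")

    total_bytes = sum(file_sizes_bytes)
    # Assume compaction packs the data into files of the target size.
    files_after = math.ceil(total_bytes / target_file_size_bytes)
    files_reduced = max(0, len(file_sizes_bytes) - files_after)

    cost_sec = total_bytes / rewrite_bytes_per_sec
    gain_sec = files_reduced * per_file_open_cost_sec
    return CompactionEstimate(cost_sec, gain_sec, files_reduced)


# Usage: 100 files of 10 MB each, compacted toward 500 MB files.
MB = 1_000_000
estimate = estimate_compaction([10 * MB] * 100, 500 * MB, 100.0 * MB, 0.5)
```

A strategy generator could then rank tables by comparing `gain_sec` against `cost_sec`, although how the two scores are combined is left open here.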

The following 3 components will be added eventually:
1) DLO library that has primitives for generating data layout
optimization strategies
2) App that generates strategies for all tables
3) Scheduling of the app

## Changes

- [ ] Client-facing API Changes
- [ ] Internal API Changes
- [ ] Bug Fixes
- [x] New Features
- [ ] Performance Improvements
- [ ] Code Style
- [ ] Refactoring
- [ ] Documentation
- [x] Tests

For all the boxes checked, please include additional details of the
changes made in this pull request.

## Testing Done

- [ ] Manually tested on local Docker setup. Please include the commands
run and their output.
- [x] Added new tests for the changes made.
- [ ] Updated existing tests to reflect the changes made.
- [ ] No tests added or updated. Please explain why. If unsure, please
feel free to ask for help.
- [ ] Some other form of testing like staging or soak time in
production. Please explain.

For all the boxes checked, include a detailed description of the testing
done for the changes made in this pull request.

## Additional Information

- [ ] Breaking Changes
- [ ] Deprecations
- [ ] Large PR broken into smaller PRs, and PR plan linked in the
description.

For all the boxes checked, include additional details of the changes
made in this pull request.