Introduce AllocateTableStorage in Storage interface #119

Merged
merged 8 commits into from
Jun 7, 2024

Conversation

@HotSushi HotSushi commented Jun 6, 2024

Summary

We introduce a new method in the storage interface as follows:

Storage {
  String allocateTableSpace(...)
}

which returns the table location where table objects will be created.

Implementations of this API should also execute one-time operations such as creating buckets and assigning permissions.

Please note the comment on the method, which describes what should not be done as part of an implementation:
"this method should avoid creating TableFormat specific directories (ex: /data, /metadata for Iceberg or _delta_log for DeltaLake)".
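To make the contract concrete, here is a minimal sketch of what an implementation might look like. It is not the code in this PR: the parameter list is elided above, so `databaseId`, `tableId`, and `tableUuid` are hypothetical, as are the `LocalDirStorage` class and its directory layout. It only illustrates the two points in the summary: do the one-time setup, return the table location, and leave format-specific subdirectories to the table format.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical shape of the interface; the real parameter list is not shown in this PR description.
interface Storage {
    String allocateTableSpace(String databaseId, String tableId, String tableUuid);
}

// Illustrative local-filesystem implementation (an assumption, not part of the PR).
class LocalDirStorage implements Storage {
    private final Path root;

    LocalDirStorage(Path root) {
        this.root = root;
    }

    @Override
    public String allocateTableSpace(String databaseId, String tableId, String tableUuid) {
        // One-time provisioning: create only the table's base directory.
        // Deliberately no /data, /metadata, or _delta_log here; the table format owns those.
        Path tableDir = root.resolve(databaseId).resolve(tableId + "-" + tableUuid);
        try {
            // createDirectories is a no-op if the directory already exists,
            // so repeated calls return the same location.
            Files.createDirectories(tableDir);
        } catch (IOException e) {
            throw new UncheckedIOException("Could not allocate " + tableDir, e);
        }
        return tableDir.toString();
    }
}
```

A bucket-backed or HDFS-backed implementation would presumably replace the directory creation with its own provisioning steps (bucket creation, permission assignment) while keeping the same return contract.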

Layering with internal implementations

Internal implementations such as LiHdfsStorage will now have the flexibility to perform one-time operations such as table provisioning.

Changes

  • New Features
  • Tests

For all the boxes checked, please include additional details of the changes made in this pull request.

Testing Done

  • Existing tests run ✅
  • Gradle build succeeds ✅
  • E2E local tests work as well ✅

HotSushi commented Jun 6, 2024

cc: @sumedhsakdeo , @jainlavina , @ctrezzo

@sumedhsakdeo sumedhsakdeo left a comment

code looks good, had one question about skipProvisioning

@ctrezzo ctrezzo left a comment

Thank you for the work!

I left some comments mainly centered around how we encapsulate the table location manipulation logic.

I have scar tissue from the MapReduce code and converting between Paths, URIs, URLs and having that logic scattered everywhere, so sorry in advance if I am focused too much on this.

@HotSushi HotSushi changed the title Introduce AllocateTableSpace in Storage interface Introduce AllocateTableStorage in Storage interface Jun 7, 2024
@ctrezzo ctrezzo left a comment

Looks good to me. Thank you for providing the extra clarifying comments around issue #121. I think it makes sense that the logic will be contained in BaseStorage once this issue is fixed and the refactor around allocateTableLocation is complete.

@HotSushi HotSushi merged commit 82242fb into linkedin:main Jun 7, 2024
1 check passed