Support using remote HDFS as storage in Raptor #13535

Merged: highker merged 1 commit into prestodb:master from jessesleeping:raptor-hdfs on Oct 22, 2019

Conversation

@jessesleeping (Contributor) commented Oct 11, 2019

TBD

== RELEASE NOTES ==

Raptor Changes
* Change `storage.data-directory` from a path to a URI. For existing deployments on local flash, the scheme prefix "file://" should be added to the original config value.
* Change error code name `RAPTOR_LOCAL_FILE_SYSTEM_ERROR` to `RAPTOR_FILE_SYSTEM_ERROR`.
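
As an illustration of the first note, the sketch below shows how a legacy path value could be turned into the new URI form. The class `DataDirectoryMigration`, its method, and the path `/var/data/raptor` are hypothetical and not part of this PR; the sketch assumes the old value is an absolute path without characters that need URI escaping.

```java
import java.net.URI;

// Hypothetical helper, not part of the PR: upgrades an old-style
// storage.data-directory value (a bare absolute path) to the URI form.
final class DataDirectoryMigration
{
    private DataDirectoryMigration() {}

    static URI toDataDirectoryUri(String configured)
    {
        URI uri = URI.create(configured);
        if (uri.getScheme() != null) {
            // Already a URI such as file:///var/data/raptor or hdfs://host/raptor
            return uri;
        }
        if (!configured.startsWith("/")) {
            throw new IllegalArgumentException("Expected an absolute path or a URI: " + configured);
        }
        // Per the release note: prepend the "file://" scheme prefix
        return URI.create("file://" + configured);
    }
}
```

For example, `toDataDirectoryUri("/var/data/raptor")` yields a URI whose scheme is `file` and whose path is `/var/data/raptor`.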

@highker highker left a comment

Some high-level comments

Let's keep this config. Use URI dataDirectory as the only source of truth to determine the FS:

  1. In FileSystemModule, use dataDirectory.getScheme() to decide which FS module to install. A local directory should be configured as "file:///raptor" (https://en.wikipedia.org/wiki/File_URI_scheme).
  2. Convert the value of dataDirectory into a Path baseLocation and use it everywhere. Ideally we should avoid using File as much as possible.
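
A minimal sketch of the scheme-based selection proposed in the two points above, using only `java.net.URI`. The `FileSystemSelector` class and its `Kind` enum are hypothetical stand-ins for the real Guice modules, and the set of accepted schemes is an assumption made for illustration.

```java
import java.net.URI;
import java.util.Locale;

// Hypothetical sketch: picks a file system kind from the data directory URI.
// The real change would install a module instead of returning an enum value.
final class FileSystemSelector
{
    enum Kind { LOCAL, HDFS }

    private FileSystemSelector() {}

    static Kind select(URI dataDirectory)
    {
        String scheme = dataDirectory.getScheme();
        if (scheme == null) {
            // A bare path such as /raptor has no scheme and is rejected
            throw new IllegalArgumentException("Data directory must include a scheme, e.g. file:///raptor: " + dataDirectory);
        }
        // URI schemes are case-insensitive, so normalize before matching
        switch (scheme.toLowerCase(Locale.ROOT)) {
            case "file":
                return Kind.LOCAL;
            case "hdfs":
                return Kind.HDFS;
            default:
                throw new IllegalArgumentException("Unsupported scheme: " + scheme);
        }
    }
}
```

Here `select(URI.create("file:///raptor"))` maps to `LOCAL` and an `hdfs://` URI maps to `HDFS`. The one-to-one mapping in this sketch is a simplification of the proposal.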

@jessesleeping (Contributor, Author) commented:

I think using the URI scheme as the "FileSystemModule" selector is not very feasible. For example, in "HdfsStorageModule" mode, we can support both the "hdfs" and "local" URI schemes. It's a many-to-many mapping, so I suggest we still use a separate config to indicate which file system we are going to use.

@highker highker left a comment

More comments

@highker highker left a comment

Could you hide fileSystem.rename(stagingFile, storageFile); in OrcStorageManager inside StorageService behind a new interface method (e.g., promoteFromStagingToStorage)? That way all of the rename logic lives in StorageService.
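
A rough sketch of what this suggestion could look like. The `StagingStorage` interface and `LocalStagingStorage` class are hypothetical simplifications, not the actual StorageService; the sketch uses `java.nio.file` so that it is self-contained, whereas the PR itself works against a file system abstraction that also covers HDFS.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.UUID;

// Hypothetical, simplified stand-in for the StorageService interface
interface StagingStorage
{
    Path getStagingFile(UUID shardUuid);

    Path getStorageFile(UUID shardUuid);

    // Callers no longer rename files themselves; the service owns that step
    void promoteFromStagingToStorage(UUID shardUuid) throws IOException;
}

// Local-disk implementation used only to illustrate the idea
final class LocalStagingStorage implements StagingStorage
{
    private final Path stagingDirectory;
    private final Path storageDirectory;

    LocalStagingStorage(Path stagingDirectory, Path storageDirectory)
    {
        this.stagingDirectory = stagingDirectory;
        this.storageDirectory = storageDirectory;
    }

    @Override
    public Path getStagingFile(UUID shardUuid)
    {
        return stagingDirectory.resolve(shardUuid + ".orc");
    }

    @Override
    public Path getStorageFile(UUID shardUuid)
    {
        return storageDirectory.resolve(shardUuid + ".orc");
    }

    @Override
    public void promoteFromStagingToStorage(UUID shardUuid) throws IOException
    {
        Path target = getStorageFile(shardUuid);
        // Make sure the destination directory exists before the rename
        Files.createDirectories(target.getParent());
        // Rename in one step; fails if an atomic move is not possible
        Files.move(getStagingFile(shardUuid), target, StandardCopyOption.ATOMIC_MOVE);
    }
}
```

With this shape, a caller such as OrcStorageManager would only invoke `promoteFromStagingToStorage(shardUuid)` and would not need to know how staging and storage paths relate.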

@jessesleeping force-pushed the raptor-hdfs branch 2 times, most recently from 19dd15e to 6b96e02, on October 21, 2019 21:23
@jessesleeping requested a review from highker on October 21, 2019 21:23

@highker highker left a comment

LGTM; the test failure looks related

3 participants