Add caching file system to hive connector #13904

Merged: highker merged 1 commit into prestodb:master from jainxrohit:rj_caching_hive on Jan 8, 2020

@jainxrohit jainxrohit commented Dec 30, 2019

== RELEASE NOTES ==

Hive Changes
* Allow reading data from HDFS while caching the fetched data on local disks. Turn on the feature by specifying the cache directory config `cache.base-directory`.

@highker highker left a comment

some comments

@highker highker left a comment

Could you remove the period in the commit title? https://chris.beams.io/posts/git-commit/ is a good commit message guideline.

@jainxrohit jainxrohit changed the title from "Add caching file system to hive connector." to "Add caching file system to hive connector" on Jan 1, 2020
@jainxrohit jainxrohit commented

> Could you remove the period in the commit title? https://chris.beams.io/posts/git-commit/ is a good commit message guideline.

Nice article, fixed the commit message.

@shixuan-fan shixuan-fan left a comment

LGTM % nits

@jainxrohit jainxrohit changed the title from "Add caching file system to hive connector" to "[WIP] Add caching file system to hive connector" on Jan 2, 2020

highker commented Jan 6, 2020

The test failure is due to a permission/auth setting. Try overriding the following function in `CachingFileSystem`:

    @Override
    public void setPermission(Path path, FsPermission permission)
            throws IOException
    {
        dataTier.setPermission(path, permission);
    }

But in general, I would suggest overriding all default functions from `FileSystem`. A good example is `FilterFileSystem`.
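
To illustrate why a wrapping file system should forward every method and not only the abstract ones, here is a minimal, self-contained sketch. It does not use the Hadoop or Presto APIs: `MiniFileSystem`, `InMemoryFileSystem`, `PartialWrapper`, and `FullWrapper` are hypothetical names invented for this example, and the no-op default in `MiniFileSystem` is an assumption made to mirror the kind of default behavior described above.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a file system base class whose optional
// operations have do-nothing defaults. This is NOT the Hadoop API.
abstract class MiniFileSystem {
    abstract boolean exists(String path);

    // Default is a no-op, so a wrapper that forgets to override it
    // silently drops the call instead of reaching the wrapped system.
    void setPermission(String path, String permission) {}

    String getPermission(String path) {
        return null;
    }
}

// In-memory "data tier" that actually stores permissions.
class InMemoryFileSystem extends MiniFileSystem {
    private final Map<String, String> permissions = new HashMap<>();

    void create(String path) {
        permissions.put(path, "rw-r--r--");
    }

    @Override
    boolean exists(String path) {
        return permissions.containsKey(path);
    }

    @Override
    void setPermission(String path, String permission) {
        permissions.put(path, permission);
    }

    @Override
    String getPermission(String path) {
        return permissions.get(path);
    }
}

// Wrapper that forwards only the abstract method.
class PartialWrapper extends MiniFileSystem {
    protected final MiniFileSystem dataTier;

    PartialWrapper(MiniFileSystem dataTier) {
        this.dataTier = dataTier;
    }

    @Override
    boolean exists(String path) {
        return dataTier.exists(path);
    }
}

// Wrapper that forwards every method to the wrapped file system.
class FullWrapper extends PartialWrapper {
    FullWrapper(MiniFileSystem dataTier) {
        super(dataTier);
    }

    @Override
    void setPermission(String path, String permission) {
        dataTier.setPermission(path, permission);
    }

    @Override
    String getPermission(String path) {
        return dataTier.getPermission(path);
    }
}

class DelegationDemo {
    public static void main(String[] args) {
        InMemoryFileSystem store = new InMemoryFileSystem();
        store.create("/data/file");

        // The inherited no-op swallows this call.
        new PartialWrapper(store).setPermission("/data/file", "rwxrwxrwx");
        System.out.println(store.getPermission("/data/file")); // prints rw-r--r--

        // The explicit override reaches the data tier.
        new FullWrapper(store).setPermission("/data/file", "rwxrwxrwx");
        System.out.println(store.getPermission("/data/file")); // prints rwxrwxrwx
    }
}
```

In this sketch the partial wrapper compiles and runs without complaint, which is what makes a missed override easy to overlook until a test that depends on it fails.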

@jainxrohit jainxrohit changed the title from "[WIP] Add caching file system to hive connector" to "Add caching file system to hive connector" on Jan 6, 2020

@highker highker left a comment

coding style comments

@jainxrohit jainxrohit requested a review from highker January 6, 2020 22:59

@highker highker left a comment

some comments

@jainxrohit jainxrohit requested a review from highker January 7, 2020 01:01

@highker highker left a comment

LGTM. @shixuan-fan, could you give it a final review and merge it?

@shixuan-fan shixuan-fan left a comment

LGTM. Ideally, we would want to have three commits:

  • Fixing caching file system
  • Raptor side change
  • Hive side change

But since it is already reviewed, I won't bother breaking it down. I'll merge it once we've completed the internal repo pull request that adapts to this one.
