Document the hive.fs.cache.max-size property #11261
Merged
A technical question, and potentially a necessary link: is this about Hive object storage caching or something else? We should link to the relevant docs.
Good question. It is unrelated to Hive object storage caching. It is the maximum size of the `TrinoFileSystemCache`, which is an implementation of the Hadoop interface `FileSystemCache`. This cache is used to speed up requests to get `FileSystem` objects. I'm afraid I don't know of any docs we could link to that explain this in more detail.
The reason I wanted to document this is that we sometimes hit this error on long-running Trino clusters:
To increase this cache size limit and avoid hitting this error, you need to change the `hive.fs.cache.max-size` property.
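To make the failure mode described above concrete, here is a minimal sketch of a size-bounded object cache that refuses new entries once it is full. It is an illustration only, not Trino's `TrinoFileSystemCache`: the class name `BoundedObjectCache`, its methods, and the exception type and message are all hypothetical, and the real cache keys, cleanup behavior, and error text may differ.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;
import java.util.function.Function;

// Hypothetical illustration; this is NOT the Trino implementation.
final class BoundedObjectCache<K, V> {
    private final int maxSize;
    private final Map<K, V> entries = new HashMap<>();

    BoundedObjectCache(int maxSize) {
        if (maxSize <= 0) {
            throw new IllegalArgumentException("maxSize must be positive");
        }
        this.maxSize = maxSize;
    }

    // Returns the cached object for the key, creating it on first use.
    // Fails instead of evicting when the cache already holds maxSize entries.
    synchronized V get(K key, Function<? super K, ? extends V> factory) {
        V existing = entries.get(key);
        if (existing != null) {
            return existing;
        }
        if (entries.size() >= maxSize) {
            throw new IllegalStateException("cache is full, limit is " + maxSize);
        }
        V created = Objects.requireNonNull(factory.apply(key), "factory returned null");
        entries.put(key, created);
        return created;
    }

    // Frees a slot, for example once the cached object has been closed.
    synchronized void remove(K key) {
        entries.remove(key);
    }

    synchronized int size() {
        return entries.size();
    }
}
```

In this sketch a long-running process that keeps requesting objects under new keys, without removing old ones, eventually fails on `get`. Raising the limit, which in Trino's case is what `hive.fs.cache.max-size` controls according to the comment above, postpones that point.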
Thanks for the explanation. So it seems there is another kind of caching going on, on the workers only I assume. Maybe @hashhar, @findepi, or others can chime in with any other relevant info we should add. From my perspective this is already a worthwhile improvement and could be merged, but more info might be better.
This is not caching of data. @electrum will know more.
Thanks @findepi. We definitely need to figure out more, and if people hit this limit often, maybe we need to raise the default value as well.
I've only hit this limit once. It's not common in my experience to hit this limit.
Since it's uncommon to hit this, why document it at all? People change values and then end up with broken setups. 🤔
Because you want to at least enable the uncommon usage. By your logic, @hashhar, we should not have this as a modifiable setting at all. All our settings should have reasonable defaults and not need changing, but we still need to support the users who do need to change a setting, and document it for them. The alternative is that users have to look things up in the log or the source code, or just can't fix a problem. That's much worse than having some documentation.
The default is "reasonable", since according to Padraig he has only ever needed to change it once.
I'm fine with documenting this, but given the number of advanced toggles we have (either for testing or for people who know what they are doing), documenting them all would just increase the burden on both the readers of the docs and the people keeping them up to date.
Sure, but that's what it takes. Documentation should be complete; all alternatives are worse.