Skip to content

Document the hive.fs.cache.max-size property#11261

Merged
hashhar merged 1 commit intotrinodb:masterfrom
posulliv:hive-fs-cache-docs
Jul 28, 2022
Merged

Document the hive.fs.cache.max-size property#11261
hashhar merged 1 commit intotrinodb:masterfrom
posulliv:hive-fs-cache-docs

Conversation

@posulliv
Copy link
Contributor

@posulliv posulliv commented Mar 1, 2022

Description

Is this change a fix, improvement, new feature, refactoring, or other?

Documentation update to document a property.

Is this a change to the core query engine, a connector, client library, or the SPI interfaces? (be specific)

No code changes.

How would you describe this change to a non-technical end user or system administrator?

Related issues, pull requests, and links

Documentation

( ) No documentation is needed.
(X ) Sufficient documentation is included in this PR.
( ) Documentation PR is available with #prnumber.
( ) Documentation issue #issuenumber is filed, and can be handled later.

Release notes

(X ) No release notes entries required.
( ) Release notes entries required with the following suggested text:

@cla-bot cla-bot bot added the cla-signed label Mar 1, 2022
@posulliv posulliv requested a review from mosabua March 1, 2022 17:54
@posulliv posulliv force-pushed the hive-fs-cache-docs branch from 8fb0dd3 to d7d2dd4 Compare March 1, 2022 17:55
@posulliv posulliv added the docs label Mar 1, 2022
@posulliv posulliv force-pushed the hive-fs-cache-docs branch from d7d2dd4 to 1580c85 Compare March 1, 2022 19:09
@posulliv posulliv force-pushed the hive-fs-cache-docs branch from 1580c85 to d6ca2cd Compare March 1, 2022 19:10
@posulliv posulliv requested a review from mosabua March 1, 2022 19:10
with a leading 0. If set to 'skip', permissions of newly
created directories will not be set by Trino.

``hive.fs.cache.max-size`` Maximum number of cached file system objects. 1000
Copy link
Member

@mosabua mosabua Mar 1, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

technical question .. and potentially necessary link.. is this about Hive object storage caching or something else? We should link to the relevant docs..

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good question. It is unrelated to hive object storage caching. It is the max size of the TrinoFileSystemCache which is an implementation of the Hadoop interface FileSystemCache.

This cache is used to speed up requests to get FileSystem objects.

I don't know any docs we could link to for explaining in more detail I'm afraid.

The reason I wanted to document this is because we hit this error sometimes on long running Trino clusters:

io.trino.spi.TrinoException: FileSystem max cache size has been reached: 1000

To increase this cache size limit and avoid hitting this error, you need to change the hive.fs.cache.max-size property.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the explanation. So it seems like there is another caching going on .. on the workers only I assume. Maybe @hashhar @findepi or others can chime in with any other relevant info we should add. From my perspective this is already a worth improvement and could be merged. but more info might be better..

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not caching of data. @electrum will know more.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @findepi .. we definitely need to figure out more .. and if people hit this limit often .. maybe we need to raise the default value as well ..

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've only hit this limit once. It's not common in my experience to hit this limit.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

since it's uncommon to hit this why document at all? People change values and then end up with broken setups. 🤔

Copy link
Member

@mosabua mosabua Jul 27, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because you want to at least enable the uncommon usage.. by your logic we should not have this as a modifiable setting at all @hashhar .. all our settings should have reasonable defaults and not need changing .. but we still need to enable the users that require changing a setting and document them. The alternative is that users are required to look things up in the log or the source code or just cant fix a problem.. thats much worse than having some documentation

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the default is "reasonable" since according to Padraig he's only ever needed to do it once.

I'm fine with documenting this but with the number of advanced toggles we have (either for testing or people who know what they are doing) documenting them all would just increase burden on both readers of the docs and the people keeping them up to date.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure.. but thats what it takes .. documentation should be complete .. all alternatives are worse

@findepi findepi requested a review from electrum March 4, 2022 09:31
@colebow colebow requested a review from mosabua July 8, 2022 16:09
Copy link
Member

@mosabua mosabua left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I honestly don't know if this is very helpful as it stands but it is definitely better than not having the property documented.

@mosabua
Copy link
Member

mosabua commented Jul 26, 2022

Can we get this merged?

@hashhar hashhar merged commit 627e395 into trinodb:master Jul 28, 2022
@github-actions github-actions bot added this to the 392 milestone Jul 28, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Development

Successfully merging this pull request may close these issues.

4 participants