-
Notifications
You must be signed in to change notification settings - Fork 3.4k
Allow different jdbc metadata cache ttls for schema and tables lists #16090
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Allow different jdbc metadata cache ttls for schema and tables lists #16090
Conversation
|
This is making configuration very complex over time. What is the actual problem we are trying to solve? Do we want to notice newly created objects faster? Or is the goal something else? |
To me it sounds like this is the solution we should implement - it would solve the problem in a simpler way and also make the config apply consistently. |
How should that work? The list of objects is not a 0/1 situation. With a table it either exists or it doesn't. A list is a list. It doesn't show up when the table is created, but it is changed. |
66c39ac to
0b35c0e
Compare
skrzypo987
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Still some tests failing.
Other than that looks good
434709d to
677d14f
Compare
plugin/trino-base-jdbc/src/main/java/io/trino/plugin/jdbc/BaseJdbcConfig.java
Outdated
Show resolved
Hide resolved
plugin/trino-base-jdbc/src/main/java/io/trino/plugin/jdbc/CachingJdbcClient.java
Outdated
Show resolved
Hide resolved
plugin/trino-base-jdbc/src/main/java/io/trino/plugin/jdbc/CachingJdbcClient.java
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we remove this constructor use the other one at all places ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Those options are now non-optional and will have to be equal to metadataCachintTtl in such cases. I think it's easier to have an overloaded constructor to indicate that.
plugin/trino-base-jdbc/src/test/java/io/trino/plugin/jdbc/TestCachingJdbcClient.java
Outdated
Show resolved
Hide resolved
plugin/trino-base-jdbc/src/test/java/io/trino/plugin/jdbc/TestCachingJdbcClient.java
Outdated
Show resolved
Hide resolved
plugin/trino-base-jdbc/src/test/java/io/trino/plugin/jdbc/TestCachingJdbcClient.java
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we split this into two methods - one for table and other for schema ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The test ensures that schema and table names cache ttls are different, it won't do that if they are separate.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The test method is quite big and it does assert multiple operations like with and without cache. Or we could have schema name cache, table name cache and a test method with combined configuration.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Those calls are inter-dependendent. First of all the test ensure that schema and table caches are different, an then it tests the evolution of the cache over time. I can split it into separate 'stages' which should make it more readable.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Split it into multiple smaller sections and extracted to a separate assertions class
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we use a DataProvider in this case ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure if the two values for boolean warrant a DataProvider, and it's not really data, it's two modes of operations and they should be expressed explcitly.
plugin/trino-base-jdbc/src/test/java/io/trino/plugin/jdbc/TestCachingJdbcClient.java
Outdated
Show resolved
Hide resolved
677d14f to
a04744f
Compare
a04744f to
0f7254c
Compare
plugin/trino-base-jdbc/src/test/java/io/trino/plugin/jdbc/TestCachingJdbcClient.java
Outdated
Show resolved
Hide resolved
plugin/trino-base-jdbc/src/main/java/io/trino/plugin/jdbc/CachingJdbcClient.java
Outdated
Show resolved
Hide resolved
plugin/trino-base-jdbc/src/main/java/io/trino/plugin/jdbc/BaseJdbcConfig.java
Outdated
Show resolved
Hide resolved
plugin/trino-base-jdbc/src/main/java/io/trino/plugin/jdbc/BaseJdbcConfig.java
Outdated
Show resolved
Hide resolved
plugin/trino-base-jdbc/src/main/java/io/trino/plugin/jdbc/BaseJdbcConfig.java
Outdated
Show resolved
Hide resolved
plugin/trino-base-jdbc/src/main/java/io/trino/plugin/jdbc/CachingJdbcClient.java
Outdated
Show resolved
Hide resolved
plugin/trino-base-jdbc/src/test/java/io/trino/plugin/jdbc/TestCachingJdbcClient.java
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The test method is quite big and it does assert multiple operations like with and without cache. Or we could have schema name cache, table name cache and a test method with combined configuration.
0f7254c to
edcd11f
Compare
plugin/trino-base-jdbc/src/main/java/io/trino/plugin/jdbc/CachingJdbcClient.java
Outdated
Show resolved
Hide resolved
plugin/trino-base-jdbc/src/test/java/io/trino/plugin/jdbc/TestCachingJdbcClient.java
Outdated
Show resolved
Hide resolved
plugin/trino-base-jdbc/src/main/java/io/trino/plugin/jdbc/BaseJdbcConfig.java
Outdated
Show resolved
Hide resolved
plugin/trino-base-jdbc/src/main/java/io/trino/plugin/jdbc/CachingJdbcClient.java
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we use a DataProvider in this case ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since cacheMissing works only at a table level - do we need to mix up with the schema related tests. We could configure some default value of schema cache ttl and create a JDBC client. WDYT ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm testing a usecase where a newly added table gets reflected in metadata lists faster than the general table cache ttl.
f25eeb0 to
0648a59
Compare
plugin/trino-base-jdbc/src/main/java/io/trino/plugin/jdbc/CachingJdbcClient.java
Outdated
Show resolved
Hide resolved
plugin/trino-base-jdbc/src/test/java/io/trino/plugin/jdbc/TestCachingJdbcClient.java
Outdated
Show resolved
Hide resolved
plugin/trino-base-jdbc/src/main/java/io/trino/plugin/jdbc/BaseJdbcConfig.java
Outdated
Show resolved
Hide resolved
plugin/trino-base-jdbc/src/main/java/io/trino/plugin/jdbc/BaseJdbcConfig.java
Outdated
Show resolved
Hide resolved
plugin/trino-base-jdbc/src/test/java/io/trino/plugin/jdbc/TestCachingJdbcClient.java
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we are asserting only the schema and table names do we need this assertion related to tableHandleByName.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's needed to make sure that metadata caches have separate ttl from individual table caches.
plugin/trino-base-jdbc/src/test/java/io/trino/plugin/jdbc/TestCachingJdbcClient.java
Outdated
Show resolved
Hide resolved
|
@atanasenko please fork with @sopel39 to ensure this is consistent with recent Hive connector metadata caching changes design |
0648a59 to
d69f36b
Compare
d69f36b to
1b25265
Compare
Praveen2112
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM.
plugin/trino-base-jdbc/src/main/java/io/trino/plugin/jdbc/BaseJdbcConfig.java
Outdated
Show resolved
Hide resolved
plugin/trino-base-jdbc/src/test/java/io/trino/plugin/jdbc/TestCachingJdbcClient.java
Outdated
Show resolved
Hide resolved
* Extract assertTableHandleByNameCache and assertTableHandleByQueryCache * Align enum import with other methods
1b25265 to
bb45631
Compare
hashhar
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
skimmed, will do another pass
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does it make sense to cap this to be lower than metadata cache ttl? Listing operations will see a table but there would be no metadata cached for it and the table might've been dropped already.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Lists and specific metadata objects are loaded at different times too, sot this scenario can still happen.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@hashhar Theoretically, there could be some situations when users want to cache lists of schemas/tables for a longer amount of time (very big number of entries which do not change over time), but still let individual table data get reloaded.
This is a generic config after all.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's a fair point.
| assertTableNamesCache(cachingJdbcClient) | ||
| .misses(1) | ||
| .loads(1) | ||
| .afterRunning(() -> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I know it's preexisting, but
Should/can lambda have a parameter client which is received from assertTableNamesCache cachingJdbcClient?
Missing table metadata can be configured to not get cached (`cacheMissing`), but that doesn't apply to schema and table name lists. In certain cases it might be beneficial to shorten the list cache ttl for them to reflect the added schemas/tables faster, while specific table caches will automatically reload tables that were missing before.
bb45631 to
215ed86
Compare
hashhar
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM % whether we want to cap the list cache TTL to be lower than metadata cache ttl
hashhar
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM from me.
|
Does this need release notes? cc @kokosing |
|
Yes, thank you for asking. CC: @atanasenko |
Description
Missing table metadata can be configured to not get cached (
cacheMissing), but that doesn't apply to schema and table name lists.In certain cases it might be beneficial to shorten the list cache ttl for them to reflect the added schemas/tables faster, while specific table caches will automatically reload tables that were missing before.
Additional context and related issues
Release notes
( ) This is not user-visible or docs only and no release notes are required.
( ) Release notes are required, please propose a release note for me.
( ) Release notes are required, with the following suggested text: