Fix TableHandles with ColumnHandles caching#14751
Fix TableHandles with ColumnHandles caching#14751ssheikin wants to merge 2 commits intotrinodb:masterfrom
Conversation
kokosing
left a comment
There was a problem hiding this comment.
Changes to me looks reasonable.
plugin/trino-base-jdbc/src/main/java/io/trino/plugin/jdbc/CachingJdbcClient.java
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Is it possible to provide a test?
There was a problem hiding this comment.
@raunaqmorarka @sopel39 could you please advice which connector invokes PruneTableScanColumns#pruneColumns e.g. running BaseConnectorTest#testDeleteWithSubquery
There was a problem hiding this comment.
That rule should trigger whenever there are more columns output from the table scan than are used by the rest of the query plan. E.g. select nationkey, count(*) from (select * from nation) group by 1 in tpch.tiny
There was a problem hiding this comment.
@ssheikin it looks like a bug fix. do you have a query reproducing the problem?
plugin/trino-base-jdbc/src/main/java/io/trino/plugin/jdbc/CachingJdbcClient.java
Outdated
Show resolved
Hide resolved
9063668 to
ce1ad4b
Compare
Table handle contains column handles. When column handle is changed, data in cache for table handles is outdated. Test TableHandles cache invalidation on columns change.
DeleteRowIdColumnHandle does not belong to the table handle however and is not designed to belong to it. This column handle required during analysis phase. The column is used for row-level delete.
ce1ad4b to
a54d488
Compare
findepi
left a comment
There was a problem hiding this comment.
"Invalidate caches more coarsely"
| { | ||
| delegate.setColumnComment(session, handle, column, comment); | ||
| invalidateColumnsCache(handle.asPlainTable().getSchemaTableName()); | ||
| invalidateTableCaches(handle.asPlainTable().getSchemaTableName()); |
There was a problem hiding this comment.
cmt msg
Invalidate caches more coarsely
Table handle contains column handles. When column handle is changed,
data in cache for table handles is outdated.
I'd suggest title like "Invalidate table handle cache when column changed"
then drop "handle" in "When column handle is changed," -- it's not the column handle what's changing (column handles are immutable), it's the column itself
There was a problem hiding this comment.
BTW why having stale io.trino.plugin.jdbc.JdbcColumnHandle#comment value inside a JdbcTableHandle.columns matters?
I don't think it's used explicitly. If it's used implicitly (via equals), then maybe we have concurrency problem.
i.e. what happens if one query pulls a JdbcTableHandle and then (before the first query planning finishes), some other query performs setColumnComment.
The CachingJdbcClient state will be eventually consistent, but the first query will be planning on an inconsistent state.
There was a problem hiding this comment.
I think the only reasonable usage of the value of the comment is in JdbcColumnHandle.getRetainedSizeInBytes()
How the situation described above differs from the e.g. addColumn?
There was a problem hiding this comment.
I think the only reasonable usage of the value of the comment is in
JdbcColumnHandle.getRetainedSizeInBytes()
so the invalidation doesn't matter.
How the situation described above differs from the e.g.
addColumn?
Good question.
Generally the JDBC table handle won't carry columns until they are projected.
But I can imagine some JDBC connector eagerly populating the JdbcTableHandle.columns field within io.trino.plugin.jdbc.JdbcClient#getTableHandle(io.trino.spi.connector.ConnectorSession, io.trino.spi.connector.SchemaTableName) call. Then, addColumn changes the state in a way that would matter.
There was a problem hiding this comment.
anyway, I am fine with doing invalidateTableCaches here, for consistency.
i just hope this is "for consistency" and not "a fix"
There was a problem hiding this comment.
JdbcTableHandle.columns matters?
It matters. Plenty of connectors for getColumns do a short-circuit and return columns from table handle. If they do not matter we should remove them.
plugin/trino-base-jdbc/src/main/java/io/trino/plugin/jdbc/CachingJdbcClient.java
Show resolved
Hide resolved
plugin/trino-base-jdbc/src/test/java/io/trino/plugin/jdbc/TestCachingJdbcClient.java
Show resolved
Hide resolved
| // do not throw when invoked, however do not allow to set non-empty comment until the connector supports setting column comments | ||
| verify(comment.isEmpty(), "This connector does not support setting column comments"); |
There was a problem hiding this comment.
This isn't correct anyway.
setColumnComment invoked with empty comment should unset existing comment, so the verify's safety is an illusion.
what about:
// Ignore (not fail) for testing purposes.
without a verify?
There was a problem hiding this comment.
This isn't correct anyway.
true.
unset existing comment
existing comment does not exist, because the connector does not support setting column comments.
And for the similar case with table comment getTableComment always returns Optional.empty(), so it's in tact.
without a verify
I'd left it as it is, because if I test setColumnComment, I'd not test it as setColumnComment(Optional.empty()), but with some value, and this could led to some higher expectations from tests which use this function.
There was a problem hiding this comment.
unset existing comment
existing comment does not exist, because
the connector does not support setting column comments.
it could exist if H2 supports column comments.
Trino is not the only gate to the underlying db
without a verify
I'd left it as it is, because if I test
setColumnComment, I'd not test it assetColumnComment(Optional.empty()), but with some value, and this could led to some higher expectations from tests which use this function.
Fine. Leave it but please make it clear it's not correct, but just as a reminder.
There was a problem hiding this comment.
I was thinking about it. I was thinking that setting empty comment is fine, for connectors that do not support it. However I was wrong:
- It is a difference between empty comment and null comment
- H2 may change and we could give a wrong impression that support it.
There was a problem hiding this comment.
Yes, it's controversial. So better trade-off is not to test caching for addComment.
There was a problem hiding this comment.
As agreed, left it as it is now.
findepi
left a comment
There was a problem hiding this comment.
"Fix verify in DefaultJdbcMetadata#applyProjection"
| verify(tableColumnSet.containsAll(newColumnSet), "applyProjection called with columns %s and some are not available in existing query: %s", newColumnSet, tableColumnSet); | ||
| Set<JdbcColumnHandle> tableSyntheticColumnSet = ImmutableSet.<JdbcColumnHandle>builder() | ||
| .addAll(tableColumnSet) | ||
| .add((JdbcColumnHandle) getDeleteRowIdColumnHandle(session, table)) |
There was a problem hiding this comment.
Make getDeleteRowIdColumnHandle return JdbcColumnHandle and remove the cast here.
There was a problem hiding this comment.
@ssheikin it looks like a bug fix. do you have a query reproducing the problem?
|
@ssheikin is it still a Draft, or for review? |
|
I've addressed comments for the first commit and extracted it as a separate PR: #14762 |
Description
Non-technical explanation
Release notes
( ) This is not user-visible or docs only and no release notes are required.
( ) Release notes are required, please propose a release note for me.
( ) Release notes are required, with the following suggested text: