Skip to content
Closed
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
Original file line number Diff line number Diff line change
Expand Up @@ -412,28 +412,28 @@ public void setTableComment(ConnectorSession session, JdbcTableHandle handle, Op
public void setColumnComment(ConnectorSession session, JdbcTableHandle handle, JdbcColumnHandle column, Optional<String> comment)
{
delegate.setColumnComment(session, handle, column, comment);
invalidateColumnsCache(handle.asPlainTable().getSchemaTableName());
invalidateTableCaches(handle.asPlainTable().getSchemaTableName());
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

cmt msg

Invalidate caches more coarsely

Table handle contains column handles. When column handle is changed,
data in cache for table handles is outdated.

I'd suggest title like "Invalidate table handle cache when column changed"

then drop "handle" in "When column handle is changed," -- it's not the column handle what's changing (column handles are immutable), it's the column itself

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

BTW why having stale io.trino.plugin.jdbc.JdbcColumnHandle#comment value inside a JdbcTableHandle.columns matters?

I don't think it's used explicitly. If it's used implicitly (via equals), then maybe we have concurrency problem.
i.e. what happens if one query pulls a JdbcTableHandle and then (before the first query planning finishes), some other query performs setColumnComment.
The CachingJdbcClient state will be eventually consistent, but the first query will be planning on an inconsistent state.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the only reasonable usage of the value of the comment is in JdbcColumnHandle.getRetainedSizeInBytes()

How the situation described above differs from the e.g. addColumn?

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the only reasonable usage of the value of the comment is in JdbcColumnHandle.getRetainedSizeInBytes()

so the invalidation doesn't matter.

How the situation described above differs from the e.g. addColumn?

Good question.
Generally the JDBC table handle won't carry columns until they are projected.

tableHandles.add(new JdbcTableHandle(schemaTableName, getRemoteTable(resultSet), getTableComment(resultSet)));

But I can imagine some JDBC connector eagerly populating the JdbcTableHandle.columns field within io.trino.plugin.jdbc.JdbcClient#getTableHandle(io.trino.spi.connector.ConnectorSession, io.trino.spi.connector.SchemaTableName) call. Then, addColumn changes the state in a way that would matter.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

anyway, I am fine with doing invalidateTableCaches here, for consistency.
i just hope this is "for consistency" and not "a fix"

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

JdbcTableHandle.columns matters?

It matters. Plenty of connectors for getColumns do a short-circuit and return columns from table handle. If they do not matter we should remove them.

}

@Override
public void addColumn(ConnectorSession session, JdbcTableHandle handle, ColumnMetadata column)
{
delegate.addColumn(session, handle, column);
invalidateColumnsCache(handle.asPlainTable().getSchemaTableName());
invalidateTableCaches(handle.asPlainTable().getSchemaTableName());
}

@Override
public void dropColumn(ConnectorSession session, JdbcTableHandle handle, JdbcColumnHandle column)
{
delegate.dropColumn(session, handle, column);
invalidateColumnsCache(handle.asPlainTable().getSchemaTableName());
invalidateTableCaches(handle.asPlainTable().getSchemaTableName());
}

@Override
public void renameColumn(ConnectorSession session, JdbcTableHandle handle, JdbcColumnHandle jdbcColumn, String newColumnName)
{
delegate.renameColumn(session, handle, jdbcColumn, newColumnName);
invalidateColumnsCache(handle.asPlainTable().getSchemaTableName());
invalidateTableCaches(handle.asPlainTable().getSchemaTableName());
}

@Override
Expand Down Expand Up @@ -578,6 +578,12 @@ CacheStats getTableNamesCacheStats()
return tableNamesCache.stats();
}

@VisibleForTesting
CacheStats getTableHandlesByNameCacheStats()
{
return tableHandlesByNameCache.stats();
}

@VisibleForTesting
CacheStats getTableHandlesByQueryCacheStats()
{
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -264,7 +264,12 @@ public Optional<ProjectionApplicationResult<ConnectorTableHandle>> applyProjecti
return Optional.empty();
}

verify(tableColumnSet.containsAll(newColumnSet), "applyProjection called with columns %s and some are not available in existing query: %s", newColumnSet, tableColumnSet);
Set<JdbcColumnHandle> tableSyntheticColumnSet = ImmutableSet.<JdbcColumnHandle>builder()
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it possible to provide a test?

Copy link
Copy Markdown
Contributor Author

@ssheikin ssheikin Oct 25, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@raunaqmorarka @sopel39 could you please advice which connector invokes PruneTableScanColumns#pruneColumns e.g. running BaseConnectorTest#testDeleteWithSubquery

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That rule should trigger whenever there are more columns output from the table scan than are used by the rest of the query plan. E.g. select nationkey, count(*) from (select * from nation) group by 1 in tpch.tiny

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ssheikin it looks like a bug fix. do you have a query reproducing the problem?

.addAll(tableColumnSet)
.add((JdbcColumnHandle) getDeleteRowIdColumnHandle(session, table))
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Make getDeleteRowIdColumnHandle return JdbcColumnHandle and remove the cast here.

.build();

verify(tableSyntheticColumnSet.containsAll(newColumnSet), "applyProjection called with columns %s and some are not available in existing query: %s", newColumnSet, tableSyntheticColumnSet);
}

return Optional.of(new ProjectionApplicationResult<>(
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -31,6 +31,7 @@
import io.trino.spi.statistics.Estimate;
import io.trino.spi.statistics.TableStatistics;
import io.trino.testing.TestingConnectorSession;
import org.jetbrains.annotations.NotNull;
import org.testng.annotations.AfterMethod;
import org.testng.annotations.BeforeMethod;
import org.testng.annotations.Test;
Expand All @@ -55,6 +56,7 @@
import static com.google.common.collect.ImmutableMap.toImmutableMap;
import static io.airlift.concurrent.Threads.daemonThreadsNamed;
import static io.trino.plugin.jdbc.TestCachingJdbcClient.CachingJdbcCache.STATISTICS_CACHE;
import static io.trino.plugin.jdbc.TestCachingJdbcClient.CachingJdbcCache.TABLE_HANDLES_BY_NAME_CACHE;
import static io.trino.plugin.jdbc.TestCachingJdbcClient.CachingJdbcCache.TABLE_HANDLES_BY_QUERY_CACHE;
import static io.trino.spi.session.PropertyMetadata.stringProperty;
import static io.trino.spi.testing.InterfaceTestUtils.assertAllMethodsOverridden;
Expand Down Expand Up @@ -273,6 +275,38 @@ public void testTableHandleOfQueryCached()
dropTable(phantomTable);
}

@Test
public void testTableHandleInvalidatedOnColumnsModifications()
{
JdbcTableHandle table = createTable(new SchemaTableName(schema, "a_table"));
JdbcColumnHandle existingColumn = addColumn(table, "a_column");

// warm-up cache
assertTableHandlesByNameCacheIsInvalidated(table);
JdbcColumnHandle newColumn = addColumn(cachingJdbcClient, table, "new_column");
assertTableHandlesByNameCacheIsInvalidated(table);
cachingJdbcClient.setColumnComment(SESSION, table, newColumn, Optional.empty());
assertTableHandlesByNameCacheIsInvalidated(table);
cachingJdbcClient.renameColumn(SESSION, table, newColumn, "new_column_name");
assertTableHandlesByNameCacheIsInvalidated(table);
cachingJdbcClient.dropColumn(SESSION, table, existingColumn);
assertTableHandlesByNameCacheIsInvalidated(table);

dropTable(table);
}

private void assertTableHandlesByNameCacheIsInvalidated(JdbcTableHandle table)
{
SchemaTableName tableName = table.asPlainTable().getSchemaTableName();

assertCacheStats(cachingJdbcClient, TABLE_HANDLES_BY_NAME_CACHE).misses(1).loads(1).afterRunning(() -> {
assertThat(cachingJdbcClient.getTableHandle(SESSION, tableName).orElseThrow()).isEqualTo(table);
});
assertCacheStats(cachingJdbcClient, TABLE_HANDLES_BY_NAME_CACHE).hits(1).afterRunning(() -> {
assertThat(cachingJdbcClient.getTableHandle(SESSION, tableName).orElseThrow()).isEqualTo(table);
});
}

@Test
public void testEmptyTableHandleIsCachedWhenCacheMissingIsTrue()
{
Expand Down Expand Up @@ -304,6 +338,11 @@ private JdbcTableHandle createTable(SchemaTableName phantomTable)
return jdbcClient.getTableHandle(SESSION, phantomTable).orElseThrow();
}

private void dropTable(JdbcTableHandle tableHandle)
{
jdbcClient.dropTable(SESSION, tableHandle);
}

private void dropTable(SchemaTableName phantomTable)
{
JdbcTableHandle tableHandle = jdbcClient.getTableHandle(SESSION, phantomTable).orElseThrow();
Expand Down Expand Up @@ -819,10 +858,16 @@ private JdbcColumnHandle addColumn(JdbcTableHandle tableHandle)
}

private JdbcColumnHandle addColumn(JdbcTableHandle tableHandle, String columnName)
{
return addColumn(jdbcClient, tableHandle, columnName);
}

@NotNull
private JdbcColumnHandle addColumn(JdbcClient client, JdbcTableHandle tableHandle, String columnName)
{
ColumnMetadata columnMetadata = new ColumnMetadata(columnName, INTEGER);
jdbcClient.addColumn(SESSION, tableHandle, columnMetadata);
return jdbcClient.getColumns(SESSION, tableHandle)
client.addColumn(SESSION, tableHandle, columnMetadata);
return client.getColumns(SESSION, tableHandle)
.stream()
.filter(jdbcColumnHandle -> jdbcColumnHandle.getColumnMetadata().equals(columnMetadata))
.findAny()
Expand Down Expand Up @@ -995,6 +1040,7 @@ enum CachingJdbcCache
{
TABLE_NAMES_CACHE(CachingJdbcClient::getTableNamesCacheStats),
TABLE_HANDLES_BY_QUERY_CACHE(CachingJdbcClient::getTableHandlesByQueryCacheStats),
TABLE_HANDLES_BY_NAME_CACHE(CachingJdbcClient::getTableHandlesByNameCacheStats),
COLUMNS_CACHE(CachingJdbcClient::getColumnsCacheStats),
STATISTICS_CACHE(CachingJdbcClient::getStatisticsCacheStats),
/**/;
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -41,6 +41,7 @@
import java.util.Map;
import java.util.Optional;

import static com.google.common.base.Verify.verify;
import static io.trino.plugin.jdbc.StandardColumnMappings.bigintColumnMapping;
import static io.trino.plugin.jdbc.StandardColumnMappings.bigintWriteFunction;
import static io.trino.plugin.jdbc.StandardColumnMappings.booleanColumnMapping;
Expand Down Expand Up @@ -109,6 +110,13 @@ public Optional<String> getTableComment(ResultSet resultSet)
return Optional.empty();
}

@Override
public void setColumnComment(ConnectorSession session, JdbcTableHandle handle, JdbcColumnHandle column, Optional<String> comment)
{
// do not throw when invoked, however do not allow to set non-empty comment until the connector supports setting column comments
verify(comment.isEmpty(), "This connector does not support setting column comments");
Comment on lines +116 to +117
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This isn't correct anyway.
setColumnComment invoked with empty comment should unset existing comment, so the verify's safety is an illusion.

what about:

// Ignore (not fail) for testing purposes.

without a verify?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This isn't correct anyway.

true.

unset existing comment

existing comment does not exist, because the connector does not support setting column comments.
And for the similar case with table comment getTableComment always returns Optional.empty(), so it's in tact.

without a verify

I'd left it as it is, because if I test setColumnComment, I'd not test it as setColumnComment(Optional.empty()), but with some value, and this could led to some higher expectations from tests which use this function.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

unset existing comment

existing comment does not exist, because the connector does not support setting column comments.

it could exist if H2 supports column comments.

Trino is not the only gate to the underlying db

without a verify

I'd left it as it is, because if I test setColumnComment, I'd not test it as setColumnComment(Optional.empty()), but with some value, and this could led to some higher expectations from tests which use this function.

Fine. Leave it but please make it clear it's not correct, but just as a reminder.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was thinking about it. I was thinking that setting empty comment is fine, for connectors that do not support it. However I was wrong:

  • It is a difference between empty comment and null comment
  • H2 may change and we could give a wrong impression that support it.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, it's controversial. So better trade-off is not to test caching for addComment.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As agreed, left it as it is now.

}

@Override
public boolean supportsAggregationPushdown(ConnectorSession session, JdbcTableHandle table, List<AggregateFunction> aggregates, Map<String, ColumnHandle> assignments, List<List<ColumnHandle>> groupingSets)
{
Expand Down