Bulk fetch all columns from all tables in JDBC connectors #22241

hashhar · 2024-06-03T09:33:03Z

Description

Before this change, when listing table columns, JDBC connectors would first list the tables and then list the columns of each table, one table at a time. Thus, when serving Trino's information_schema.columns or system.jdbc.columns, we would make O(#tables) calls to the remote database.

With this change, we use the remote database's bulk column listing facilities to satisfy Trino's bulk column listing requests. This can be viewed as "information_schema.columns pass-through", although it works for both Trino's information_schema.columns and Trino's system.jdbc.columns (io.trino.jdbc.TrinoDatabaseMetaData.getColumns), and it does not use the remote database's information_schema.columns directly. Instead, the commit leverages the fact that DatabaseMetaData.getColumns, which is typically used to get the columns of a single table, can also be called without a table filter, in which case it returns all columns from all tables.
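
To illustrate the idea, here is a minimal sketch using plain JDBC. It is not the code from this PR: the class `BulkColumnListing` and the method `listAllColumns` are hypothetical names, and the sketch assumes the remote driver accepts a null table name pattern as "no filter" (some drivers may need `"%"` instead).

```java
import java.sql.DatabaseMetaData;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper for illustration only; not part of Trino's JdbcClient API.
public final class BulkColumnListing {
    private BulkColumnListing() {}

    /**
     * Lists the columns of every table in a schema with a single metadata call,
     * instead of one getColumns call per table.
     *
     * Note: schemaPattern is a LIKE-style pattern in JDBC, so a real
     * implementation would need to escape "_" and "%" in the schema name.
     */
    public static Map<String, List<String>> listAllColumns(
            DatabaseMetaData metadata, String catalog, String schemaPattern)
            throws SQLException {
        Map<String, List<String>> columnsByTable = new LinkedHashMap<>();
        // Assumption: a null tableNamePattern means "do not filter by table".
        // A null columnNamePattern likewise requests all columns.
        try (ResultSet rows = metadata.getColumns(catalog, schemaPattern, null, null)) {
            while (rows.next()) {
                String table = rows.getString("TABLE_NAME");
                String column = rows.getString("COLUMN_NAME");
                // Group the flat result set back into per-table column lists.
                columnsByTable.computeIfAbsent(table, ignored -> new ArrayList<>()).add(column);
            }
        }
        return columnsByTable;
    }
}
```

With a live connection this could be called as `BulkColumnListing.listAllColumns(connection.getMetaData(), null, "my_schema")`, where `my_schema` is a placeholder. The remote database is contacted once per listing request rather than once per table.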

Bulk retrieval is enabled only for selected JDBC connectors. It is not supported by default, because each connector needs JdbcClient changes to opt in.

Release notes

( ) This is not user-visible or is docs only, and no release notes are required.
( ) Release notes are required. Please propose a release note for me.
(x) Release notes are required, with the following suggested text:

# MariaDB, MySQL, SingleStore, Redshift
* Improve performance of listing table columns. ({issue}`issuenumber`)

Before this change, when listing table columns, JDBC connectors would first list tables and then list columns of a table. Thus, when serving Trino's `information_schema.columns` or `system.jdbc.columns`, we would make O(#tables) calls to the remote database.

With this change, we utilize remote database's bulk column listing facilities to satisfy Trino's bulk column listing requests. This can be viewed as "`information_schema.columns` pass-through", although this works for both Trino's `information_schema.columns` and Trino's `system.jdbc.columns` (`io.trino.jdbc.TrinoDatabaseMetaData.getColumns`), and does not use remote database's `information_schema.columns` directly. Instead, the commit leverages the fact that `DatabaseMetaData.getColumns` typically used to get columns of a table can be used without a table filter, and then it gets all columns from all tables.

The bulk retrieval is supported for selected JDBC connectors, and by default is not supported (requires `JdbcClient` changes).

Co-authored-by: Ashhar Hasan <[email protected]>

hashhar requested review from ebyhr and findepi June 3, 2024 09:33

cla-bot bot added the cla-signed label Jun 3, 2024

findepi approved these changes Jun 3, 2024

ebyhr approved these changes Jun 3, 2024

hashhar merged commit 1ac1ee1 into trinodb:master Jun 4, 2024

hashhar deleted the hashhar/bulk-fetch-all-columns branch June 4, 2024 08:59

github-actions bot added this to the 450 milestone Jun 4, 2024

This was referenced Jun 5, 2024

Implement getAllTableColumns in Snowflake #22264

Closed

Disable testBulkColumnListingOptions in Snowflake #22265

Merged

colebow mentioned this pull request Jun 7, 2024

Add Trino 450 release notes #22327

Merged

Praveen2112 mentioned this pull request Mar 6, 2025

Bulk fetch all columns from all tables per schema for Oracle connector #25231

Merged
