Skip to content

Conversation

@pajaks
Copy link
Member

@pajaks pajaks commented Sep 18, 2024

Description

Right now case-insensitive-cache may contain only one value (in most cases) with projectId as key and names mapping as value. This causes cache to be build once and is never updated until it's invalidated. Also it's build with values available from one query only. It's not big problem for schema as for all queries we always supply cache from client.listDatasetIds(projectId).
For table it's buggy as some queries have limited information.
For example:

  • select from single table will return only one list element with selected table from BigQuery, we will store only this table in cache mapping
  • next select for another table will fetch mapping from cache and as new table is not present there it will return empty result
  • empty result will cause table not found exception

This PR changes cache structure to be based on table/schema name as key. Also cache is updated on every query if key is not found.

Additional context and related issues

Change in test to remove .put("bigquery.service-cache-ttl", "0ms") is needed to get cache working. With this property set to 0. BigQueryClient is initialised for each query, which results in building cache from scratch for each query.

Release notes

( ) This is not user-visible or is docs only, and no release notes are required.
( ) Release notes are required. Please propose a release note for me.
(x) Release notes are required, with the following suggested text:

# BigQuery
* Fix query failure when `bigquery.service-cache-ttl` config property isn't `0ms` and 
  case insensitive name matching is enabled ({issue}`23481`)

@cla-bot cla-bot bot added the cla-signed label Sep 18, 2024
@pajaks pajaks force-pushed the pajaks/bigquery_cache_fix branch from cacc1ed to 0c40ca2 Compare September 18, 2024 10:06
@ebyhr
Copy link
Member

ebyhr commented Sep 18, 2024

/test-with-secrets sha=0c40ca249b2b6874f0f0feaa9c7774a8ad6b03c4

@github-actions github-actions bot added the bigquery BigQuery connector label Sep 18, 2024
@github-actions
Copy link

github-actions bot commented Sep 18, 2024

The CI workflow run with tests that require additional secrets finished as failure: https://github.com/trinodb/trino/actions/runs/10920739075

@pajaks pajaks force-pushed the pajaks/bigquery_cache_fix branch from 0c40ca2 to ae6499b Compare September 18, 2024 11:56
@pajaks pajaks requested review from SemionPar and ebyhr September 19, 2024 08:28
@pajaks pajaks force-pushed the pajaks/bigquery_cache_fix branch from ae6499b to 847cecb Compare September 19, 2024 08:29
@ebyhr ebyhr requested a review from hashhar September 19, 2024 10:20
@pajaks pajaks force-pushed the pajaks/bigquery_cache_fix branch from 847cecb to ff782a7 Compare September 20, 2024 07:28
@ebyhr
Copy link
Member

ebyhr commented Sep 20, 2024

/test-with-secrets sha=ff782a76f6a13e7516f68ed0ec044207d73b53c6

@github-actions
Copy link

github-actions bot commented Sep 20, 2024

The CI workflow run with tests that require additional secrets finished as failure: https://github.com/trinodb/trino/actions/runs/10957801373

@pajaks
Copy link
Member Author

pajaks commented Sep 20, 2024

The CI workflow run with tests that require additional secrets finished as failure: https://github.com/trinodb/trino/actions/runs/10957801373

Failed on not related pt (default, suite-azure, )

@ebyhr ebyhr merged commit 7761b45 into trinodb:master Sep 20, 2024
@github-actions github-actions bot added this to the 459 milestone Sep 20, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

bigquery BigQuery connector cla-signed

Development

Successfully merging this pull request may close these issues.

3 participants