
fix(bigquery): create read session with client or job projectID #10932


@alvarowolfx alvarowolfx commented Sep 30, 2024

When reading result sets with Storage Read API acceleration enabled, the read session is currently created in the table's project by default. This works when the destination table is not specified and is created automatically, because the table then defaults to the project where the query or job was created. But when reading a table directly or specifying a destination table, it fails if the client doesn't have BQ Storage permissions on the table's project (just table read permission, for example). This is a common use case: some customers have a main billing project that has access to other GCP projects with only permission to read data from BigQuery tables.

With this PR, we default to the defined Query/Job projectID (which itself defaults to the current bigquery.Client.projectID). When reading a table directly, we also default to the bigquery.Client.projectID.
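As a rough illustration of the selection rule described above (not the code from this PR), the fallback can be sketched with a small helper. The names `readSessionProject` and `sessionParent`, and the project IDs, are hypothetical and exist only for this example; the `projects/{project_id}` form is the parent resource format the Storage Read API expects when creating a read session.

```go
package main

import "fmt"

// readSessionProject is a hypothetical helper, not part of the library.
// It returns the project under which a read session would be created:
// the Query/Job project when one is set, otherwise the client project.
// The table's own project is deliberately not consulted.
func readSessionProject(clientProject, jobProject string) string {
	if jobProject != "" {
		return jobProject
	}
	return clientProject
}

// sessionParent formats the parent resource name used when creating
// a read session, i.e. "projects/{project_id}".
func sessionParent(projectID string) string {
	return "projects/" + projectID
}

func main() {
	// Direct table read: no job is involved, so the client project is used
	// even if the table lives in a different project.
	fmt.Println(sessionParent(readSessionProject("billing-project", "")))

	// Query or job with an explicit project ID.
	fmt.Println(sessionParent(readSessionProject("billing-project", "analytics-project")))
}
```

Under this sketch, a client configured with a central billing project creates sessions in that project regardless of where the table being read is stored.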

Reported initially on PR #10924

Supersedes #10924

@alvarowolfx alvarowolfx requested review from a team as code owners September 30, 2024 00:19
@product-auto-label product-auto-label bot added the api: bigquery Issues related to the BigQuery API. label Sep 30, 2024
@alvarowolfx
Contributor Author

@juanli16 @fsaintjacques I think this should have been the default behavior in the first place. It solves your use cases and doesn't require adding overrides. Let me know if it makes sense.

@fsaintjacques

@alvarowolfx I don't mind the fix since it won't break our use case, but beware that this might break existing users of this library (due to IAM).

@alvarowolfx
Contributor Author

@fsaintjacques thanks for raising the concern. I think it was a mistake in the first place to use the table projectID to create the session, since the basic use case of reading public datasets would fail with that approach.

Other than that, I think I can see two major groups here:

  • For users going through the Query.Read and Job.Read paths, I don't think this is going to break anything, as the destination table project was already the same as the one set on the Query or Job.
  • For users reading a table directly (or setting a destination table on a query), I feel like most of them were actually facing the same problem you are facing, where they didn't have permission to create BigQuery Storage Read sessions. All customers that I have worked with have a setup similar to the one you described: a centralized billing project (set on the client) that can read result sets in other projects with just table read permissions.
    • It would break for users that have BQ Storage permission on the project of the table being read but not on the project set on the client (which sounds like an edge case).
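To make the two groups concrete, here is a deliberately simplified model of the IAM situation, not real permission checking. It assumes the only thing that matters is whether the caller may create read sessions (the `bigquery.readsessions.create` permission) in the project chosen for the session. The `setup` type, the helper functions, and the project names are hypothetical.

```go
package main

import "fmt"

// setup is a hypothetical description of one user's environment.
type setup struct {
	name             string
	clientProject    string          // project configured on the client
	tableProject     string          // project that owns the table being read
	canCreateSession map[string]bool // projects where read sessions can be created
}

// oldBehaviorWorks models creating the session in the table's project.
func oldBehaviorWorks(s setup) bool { return s.canCreateSession[s.tableProject] }

// newBehaviorWorks models creating the session in the client's project.
func newBehaviorWorks(s setup) bool { return s.canCreateSession[s.clientProject] }

// centralBilling: session permission only on the billing project and
// plain table read access elsewhere.
var centralBilling = setup{
	name:             "central billing project",
	clientProject:    "billing-project",
	tableProject:     "data-project",
	canCreateSession: map[string]bool{"billing-project": true},
}

// tableProjectOnly: the edge case, with session permission only on the
// project that owns the table.
var tableProjectOnly = setup{
	name:             "storage permission on table project only",
	clientProject:    "billing-project",
	tableProject:     "data-project",
	canCreateSession: map[string]bool{"data-project": true},
}

func main() {
	for _, s := range []setup{centralBilling, tableProjectOnly} {
		fmt.Printf("%s: old=%v new=%v\n", s.name, oldBehaviorWorks(s), newBehaviorWorks(s))
	}
}
```

In this model the central billing setup only works with the new behavior, while the edge case only worked with the old one.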

Do you see any other scenario that would break?

In any case, I think we can move forward with PR #10924 in parallel, to allow users to set the project used to create sessions.

juanli16 added a commit to juanli16/google-cloud-go that referenced this pull request Sep 30, 2024
@fsaintjacques

Nope, I don't foresee any other issues. I wanted to warn you that this was potentially breaking some existing (un)lucky setup.

@alvarowolfx alvarowolfx added the automerge Merge the pull request once unit tests and other checks pass. label Oct 1, 2024
@gcf-merge-on-green gcf-merge-on-green bot merged commit f98396e into googleapis:main Oct 1, 2024
8 checks passed
@gcf-merge-on-green gcf-merge-on-green bot removed the automerge Merge the pull request once unit tests and other checks pass. label Oct 1, 2024
juanli16 added a commit to juanli16/google-cloud-go that referenced this pull request Oct 1, 2024

4 participants