Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add cursor description support for ARRAY native type #1128

Open
gallodustin opened this issue Oct 16, 2024 · 1 comment
Open

Add cursor description support for ARRAY native type #1128

gallodustin opened this issue Oct 16, 2024 · 1 comment
Assignees
Labels
api: bigquery Issues related to the googleapis/python-bigquery-sqlalchemy API.

Comments

@gallodustin
Copy link

gallodustin commented Oct 16, 2024

Is your feature request related to a problem? Please describe.

When fetching column types from the cursor.description based on the result of executing a BQ query, ARRAY types seem to be misrepresented. For example:

conn.execute("CREATE TABLE my_project.my_dataset.new_table (numbers ARRAY<INT64>)")
conn.execute("INSERT INTO my_project.my_dataset.new_table (numbers) VALUES ([1, 2, 3])")
result = conn.execute("SELECT * FROM my my_project.my_dataset.new_table")
print(result.cursor.description)
# Column(name='numbers', type_code='INTEGER', display_size=None, internal_size=None, precision=None, scale=None, null_ok=False)

This causes my consuming code to incorrectly interpret the type of that column, which leads to failures down the line.

Describe the solution you'd like

Would it be possible to have the cursor description in this case include the ARRAY type instead? I see that the values fetchall() returns from this result always include brackets [] which seems parse-able to me. Also, ARRAY already exists in the type map, so I assume there is some level of support already.

Describe alternatives you've considered

As a workaround, it is possible to get the proper ARRAY<INT64> type by instead querying INFORMATION_SCHEMA. However, it would be more convenient and performant in many cases to use the cursor description of a SELECT * ... LIMIT 0 query. More generally, I expect those types to be accurate.

Additional context

This problem remains if the LIMIT 0 is removed, by the way, as in my example at the top of the post.

@product-auto-label product-auto-label bot added the api: bigquery Issues related to the googleapis/python-bigquery-sqlalchemy API. label Oct 16, 2024
@gallodustin
Copy link
Author

gallodustin commented Oct 16, 2024

Perhaps the underlying difficulty here is that the structure of the cursor description itself doesn't allow any clear way to communicate that this column type is ARRAY and simultaneously that the values inside the ARRAY are of type INT64? In this case, at least returning type ARRAY would give consuming code a clear "heads up" that some special handling needs to happen for this column, rather than perpetuating the false assumption that the column is the same as any other INT64 column.

I recognize that would be a breaking change, but to me at least it seems like a clearer way of communicating the situtation based on current limitations.

Edit: even more helpful would be something like ARRAY[INT64].

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
api: bigquery Issues related to the googleapis/python-bigquery-sqlalchemy API.
Projects
None yet
Development

No branches or pull requests

2 participants