feat(api): add TableUnnest operation to support cross-join unnest semantics as well as offset #9423

Merged
cpcloud merged 8 commits into ibis-project:main from bigquery-table-unnest on Jul 1, 2024

Conversation

@cpcloud (Member) commented Jun 21, 2024

Would love to get feedback from folks on the API, especially @tswast. The new test is likely to fail on many backends, so I'm looking for a bit of feedback before beefing up the test suite.

Closes #7781.
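
For readers following the thread, the semantics named in the title (one output row per array element, joined back to its originating row, with an optional element position) can be sketched in plain Python. This is only an illustration, not the PR's implementation: the function `unnest_rows` and its `offset` parameter are hypothetical names chosen for the sketch, rows are modeled as dicts, and the zero-based offset is an assumption.

```python
from typing import Any, Optional


def unnest_rows(
    rows: list[dict[str, Any]], column: str, offset: Optional[str] = None
) -> list[dict[str, Any]]:
    """Emulate cross-join unnest on a list of dict rows (illustrative only)."""
    result = []
    for row in rows:
        # A null or empty array contributes no output rows, as with an
        # inner CROSS JOIN UNNEST in SQL.
        for position, element in enumerate(row[column] or []):
            new_row = dict(row)        # all other columns repeat per element
            new_row[column] = element  # the array column becomes one element
            if offset is not None:
                # Zero-based position; an assumption of this sketch.
                new_row[offset] = position
            result.append(new_row)
    return result


rows = [
    {"x": 1, "y": [10, 20]},
    {"x": 2, "y": []},
    {"x": 3, "y": [30]},
]
unnested = unnest_rows(rows, "y", offset="idx")
```

In this sketch the row with `x == 2` disappears because its array is empty, which is the behavior that distinguishes a cross-join unnest from a left-join variant.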

@cpcloud added the feature, bigquery, and sql labels Jun 21, 2024
@cpcloud force-pushed the bigquery-table-unnest branch from a5ada45 to ff89731 June 21, 2024 16:31
@cpcloud requested a review from tswast June 21, 2024 16:50
@cpcloud force-pushed the bigquery-table-unnest branch from ceddad0 to dc0460f June 21, 2024 16:52
@cpcloud force-pushed the bigquery-table-unnest branch 4 times, most recently from ede6499 to 85360c8 June 24, 2024 18:34
@cpcloud (Member, Author) commented Jun 24, 2024

OK, I've started going down the rabbit hole of supporting this on as many of our array-capable backends as possible.

@cpcloud added this to the 9.2 milestone Jun 24, 2024
@cpcloud marked this pull request as ready for review June 24, 2024 19:29
@cpcloud force-pushed the bigquery-table-unnest branch 12 times, most recently from 3a271da to 7aed2e7 June 25, 2024 14:33
@cpcloud added the postgres, pyspark, and duckdb labels Jun 25, 2024
@cpcloud force-pushed the bigquery-table-unnest branch from 7aed2e7 to 6a8ab90 June 26, 2024 16:29
@cpcloud added the ci-run-cloud label Jun 26, 2024
@ibis-docs-bot removed the ci-run-cloud label Jun 26, 2024
@cpcloud requested review from jcrist and gforsyth June 26, 2024 18:23
ibis/backends/bigquery/compiler.py (resolved)
ibis/backends/bigquery/compiler.py (outdated, resolved)
    ["y", lambda t: t.y, ibis._.y],
    ids=["string", "lambda", "deferred"],
)
def test_table_unnest(backend, colspec):
Collaborator

I'm curious if table.unnest works with doubly nested (array<struct<array<>>) fields? If so, could you add a test for that case?

Member Author

Added a test case in 971b005 (#9423). Let me know whether it's sufficient.
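
The doubly nested case raised above (an array of structs that each hold another array) can be pictured as two unnest passes with a struct-field lift in between. The following plain-Python sketch is not the test added in the commit; the helper `explode`, the column names, and the sample data are all hypothetical and exist only to show the shape of the operation.

```python
def explode(rows, column):
    """Return one row per element of rows[i][column]; empty arrays yield no rows."""
    return [
        {**row, column: element}
        for row in rows
        for element in (row[column] or [])
    ]


# Hypothetical data shaped like array<struct<name, inner: array<int>>>.
rows = [
    {"id": 1, "outer": [{"name": "a", "inner": [1, 2]}, {"name": "b", "inner": []}]},
    {"id": 2, "outer": [{"name": "c", "inner": [3]}]},
]

# Pass 1: unnest the outer array, giving one struct per row.
level1 = explode(rows, "outer")

# Lift the nested array out of the struct into its own column.
level1 = [{**row, "inner": row["outer"]["inner"]} for row in level1]

# Pass 2: unnest the lifted inner array.
level2 = explode(level1, "inner")

# Keep only scalar columns for readability.
flat = [
    {"id": row["id"], "name": row["outer"]["name"], "inner": row["inner"]}
    for row in level2
]
```

Note that the struct named `"b"` produces no output in this sketch, since its inner array is empty and both passes follow cross-join semantics.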

@cpcloud force-pushed the bigquery-table-unnest branch 4 times, most recently from 692371b to d412da0 June 27, 2024 10:45
@cpcloud requested a review from tswast June 27, 2024 10:47
@gforsyth (Member) left a comment

6 SQL backends, 6 slightly different (or just plain gnarly) SQL implementations.

Yay standards!

Anyway, this looks good to me -- really nice docstring for Table.unnest

@cpcloud force-pushed the bigquery-table-unnest branch from d412da0 to 268299b June 27, 2024 17:35
@cpcloud added the ci-run-cloud label Jun 27, 2024
@ibis-docs-bot removed the ci-run-cloud label Jun 27, 2024
@cpcloud (Member, Author) commented Jul 1, 2024

Rebasing, will merge on green.

@cpcloud force-pushed the bigquery-table-unnest branch from 268299b to ee0280c July 1, 2024 13:25
@cpcloud (Member, Author) commented Jul 1, 2024

Clouds are good:

…/ibis on  bigquery-table-unnest is 📦 v9.1.0 via 🐍 v3.10.14 via ❄️   impure (ibis-3.10.14-env) took 30s
❯ pytest -m 'bigquery or snowflake' -n 8 --dist loadgroup --snapshot-update -q
bringing up nodes...
.....................x.x.....................x...........x...x.x........xx.....x....x.................x.x..xx............x..s....x.x................................x................. [  4%]
.........x.......x...........................x......................x..x.xxx....x..xxxx..xx.......x...x...x......x....................s...................s..........................s [  9%]
.....................x.....x....x...x.....xx.........................x.....................x..x.......x.........x....x.................x..x..............x.....................x...... [ 14%]
........x.......................x....x......xx..x.x.....x.....x.x........x......x..x..........x....xx....xx.x...x....x..........x.x........x...................x......x.....xx.x.....x [ 19%]
.........x...xx.xxxxxx.x.x.......x.x..xxx.x....x.x........x..............x......x....................x.x..xx....xx..x.......................................x...........xx............ [ 24%]
....x............................x...x.....x......................xx.......x......................x........xx.x.....x..............x..........x........x...x........x............x.... [ 29%]
..x.............x..........x.............sssssssssssssssssssss.............x............x...xs..............x........x..x.............x....................x.......................... [ 34%]
......................................x........................................x.......................x.........................x.....x......s.x..............x.xxx..x...xxx...xxx.xx [ 39%]
xxx..xx.xx.x.xx..xxxxx.x.xx..xxx........x..xx...x..x...xxx.xxxxxxx..xxx....x....x..x.xxx....x...xxx...xx.x........x...xxx..xx.x...x.xx.x...x..xx.xx....xx..x.xx.xxx....xxx......x..... [ 44%]
.xx........................x.......x...........................x................x............................x.........xx.....x...........x..x..................x.x.........xxxx.xxx.x [ 49%]
xx...xx...x......x.x..x...........s.....s...............................................s.........x.x..x........x.............xx....x........x.......x.............................x.. [ 54%]
........x...........x.........................x....................................................x.........................................x................xx..x..x........x....x.. [ 59%]
.x.....x...x.........x...x.......x.xx.x.............x..x...x.x.........x.x.........x.x....x......xxxxxxxx..x.x.x.x...x.xx.x..x.....xxx..x...x.................xx...x......x.......x.xx [ 64%]
..x..x.............x..........x.......x..x.......x....xx............x.x..........x.....x..x.x..x..xx.......x.........x..............x..x.........................................x.x.. [ 69%]
....x.....xx.........x...x.....x..x...x..xx..........................x............x..x....xx....xx...x......................x..x..x......................x...........x......s.....x... [ 73%]
......................xxx......x........xx.............x........x...xx..x....xxx....x.x..x..............x...x...x.xx..........x......x.x.....x.....x...x...x....s..x..x............... [ 78%]
.........s....x...x...x.x..s.s.x.........xxxxxxx..xx..x..xxxxxxxxxxx.xxxxxxxxxxxxxxx.xxx.xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx.xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx.sxxxxsx.xx.xx.... [ 83%]
...........................x.......................x.........................................................................s.....................................x.x................ [ 88%]
.................................s.................................................................................................................................................... [ 93%]
..........................................................................................................x........................................................................... [ 98%]
..................................................                                                                                                                                     [100%]
3099 passed, 39 skipped, 552 xfailed in 835.57s (0:13:55)

@cpcloud merged commit 3352a84 into ibis-project:main Jul 1, 2024
75 of 76 checks passed
@cpcloud deleted the bigquery-table-unnest branch July 1, 2024 14:22
@cpcloud mentioned this pull request Jul 3, 2024
1 task
Labels
bigquery, duckdb, feature, postgres, pyspark, risingwave, snowflake, sql, trino
Development

Successfully merging this pull request may close these issues.

feat(bigquery): ArrayValue.as_table(offset_name: str | None) as a table-valued function
3 participants