feat: Support concat(..., how="diagonal") for ibis#3404
Managed to cause **33!** yells from pyright originally
After aligning, all frames are guaranteed to have the same column names in the same order. Checking the dtypes *too*, when `ibis` will handle this already, seems excessive.
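For context, here is a rough sketch of what "aligning" means for a diagonal concat. It is a pure-Python model, not the `ibis` or narwhals implementation: `Frame`, `align_diagonal` and `concat_diagonal` are hypothetical names, and a frame is assumed to be a plain dict of equal-length lists.

```python
from typing import Any

# Hypothetical stand-in for a backend table: column name -> list of values.
Frame = dict[str, list[Any]]


def align_diagonal(frames: list[Frame]) -> list[Frame]:
    """Give every frame the same columns, in the same order, padding with None."""
    # Union of all column names, kept in order of first appearance.
    columns = list(dict.fromkeys(name for frame in frames for name in frame))
    aligned = []
    for frame in frames:
        # Row count of this frame (0 if it has no columns at all).
        n_rows = len(next(iter(frame.values()), []))
        aligned.append({name: frame.get(name, [None] * n_rows) for name in columns})
    return aligned


def concat_diagonal(frames: list[Frame]) -> Frame:
    """After alignment, a plain vertical stack is all that is left to do."""
    aligned = align_diagonal(frames)
    columns = list(aligned[0]) if aligned else []
    return {name: [v for frame in aligned for v in frame[name]] for name in columns}
```

Once every frame shares the same schema, the remaining step is the same as `how="vertical"`, which is why an extra dtype check on top of the backend's own union validation looks redundant in this model.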
Surprised we don't have tests for this
) -> None:
    if "ibis" in str(constructor):
        request.applymarker(pytest.mark.xfail)
def test_concat_diagonal(constructor: Constructor) -> None:
I do think there's something here, I just wanna experiment a bit (and add more tests) ❤️
@FBruzzesi okay this led to discovering a bug (but not in any of the new code or your suggestion)
ibis is the only backend that doesn't guarantee the order of union.
I found that out by using the 3-tabled version of the test:
That fails for ibis, but it turns out "vertical" fails too:
Show test
def test_concat_vertical_bigger(constructor: Constructor) -> None:
    data_1 = {"a": [1, 2], "b": [3, 4], "c": [0, None]}
    data_2 = {"a": [5, 6], "b": [0, None], "c": [7, 8]}
    data_3 = {"a": [0, None], "b": [9, 10], "c": [11, 12]}
    expected = {
        "a": [1, 2, 5, 6, 0, None],
        "b": [3, 4, 0, None, 9, 10],
        "c": [0, None, 7, 8, 11, 12],
    }
    df_1 = nw.from_native(constructor(data_1)).lazy()
    df_2 = nw.from_native(constructor(data_2)).lazy()
    df_3 = nw.from_native(constructor(data_3)).lazy()
    result = nw.concat([df_1, df_2, df_3], how="vertical")
    assert_equal_data(result, expected)

Show error
E AssertionError: Mismatch at index 0, key a: 0 != 1
E Expected: {'a': [1, 2, 5, 6, 0, None], 'b': [3, 4, 0, None, 9, 10], 'c': [0, None, 7, 8, 11, 12]}
E Got: {'a': [0, None, 5, 6, 1, 2], 'b': [9, 10, 0, None, 3, 4], 'c': [11, 12, 7, 8, 0, None]}

I suppose I'm stuck with testing two tables for now then 😂
I've added (a8e8388), but should probably follow this up with another issue.
(We (and polars) don't document that it is ordered, but we do test for it, and `polars.union` was recently introduced for the unordered case.)
> ibis is the only backend that doesn't guarantee the order of union.
For a moment I thought order of columns, and I panicked sooo much 🤯
No guarantee of row order makes sense. You can add an index column and sort by it.
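A minimal sketch of that idea, again as a pure-Python model rather than real `ibis` or narwhals calls. `with_index`, `unordered_union` and `ordered_union` are hypothetical helpers, the shuffle only simulates a backend that returns union rows in arbitrary order, and the index name `"idx"` is assumed not to clash with an existing column.

```python
import random
from typing import Any

Row = dict[str, Any]
Table = list[Row]


def with_index(tables: list[Table], name: str = "idx") -> list[Table]:
    """Tag every row with a running index that spans all input tables."""
    tagged, offset = [], 0
    for table in tables:
        tagged.append([{**row, name: offset + i} for i, row in enumerate(table)])
        offset += len(table)
    return tagged


def unordered_union(tables: list[Table], seed: int = 0) -> Table:
    """Simulate a union that makes no promise about row order."""
    rows = [row for table in tables for row in table]
    random.Random(seed).shuffle(rows)
    return rows


def ordered_union(tables: list[Table], name: str = "idx") -> Table:
    """Union, then restore input order by sorting on the index and dropping it."""
    rows = unordered_union(with_index(tables, name))
    rows.sort(key=lambda row: row[name])
    return [{k: v for k, v in row.items() if k != name} for row in rows]
```

With something like this in place, the three-table test would no longer depend on the order the backend happens to return.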
Co-authored-by: Francesco Bruzzesi <42817048+FBruzzesi@users.noreply.github.com> #3404 (comment) #3404 (comment)
FBruzzesi left a comment
Thanks @dangotbanned - I left just one comment for the ibis implementation.
On the protocol side I think we debated enough yesterday 😇

Description
While reviewing #3398, one option I raised would be to adopt how `polars` handles `how="diagonal"`.
Whether we do/don't can be decided on later, but by adding support for `ibis` now ...

Important
All backends support `concat={"diagonal", "vertical"}`

Related issues
concat(..., how="*_relaxed"}) #3398 (comment)