
feat: IbisLazyFrame support #2000

Merged
MarcoGorelli merged 123 commits into narwhals-dev:main from rwhitten577:feature/initial-ibis-lazyframe
May 13, 2025

Conversation

@rwhitten577
Contributor

@rwhitten577 rwhitten577 commented Feb 12, 2025

What type of PR is this? (check all applicable)

  • 💾 Refactor
  • ✨ Feature
  • 🐛 Bug Fix
  • 🔧 Optimization
  • 📝 Documentation
  • ✅ Test
  • 🐳 Other

Related issues

Checklist

  • Code follows style guide (ruff)
  • Tests added
  • Documented the changes

If you have comments or can explain your changes, please do so below

@NickCrews

I want ibis support in narwhals and can help to get this landed. Ping me if you want a review or help!

Cc for visibility @cpcloud, the maintainer of ibis

@MarcoGorelli
Member

thanks @NickCrews for your help! sure - we're in the middle of some large refactors so it may be prudent to wait a bit to avoid too many merge conflicts, but we will get to this

@dangotbanned
Member

ibis.selectors

I'm interested in how we might be able to take advantage of ibis.selectors.
I noted in (#2064 (comment)) that ibis is an outlier because of its native selector support

Initial thoughts are that our adaptation could work quite differently to the other backends - maybe even being more performant

Since (#2064) every backend besides polars is sharing a lot of code - which recently moved to nw._compliant.selectors.
Most of this is simply performing operations on nw.Schema

@MarcoGorelli
Member

Hey @rwhitten577 - we're done with the big refactors, so if it interests you to continue, we'd love to ship this 🚢 🚀

I'd suggest not worrying about doing anything different for selectors just yet (it looks to me like they call .schema anyway?), mirroring what we do for duckdb / sqlframe / pyspark should be fine 👍

@afrisgaard

How would this land? (Just considering to be ready when ibis ships to Narwhals :) )

Something like this:
ibis_frame = ibis.use_backend("BigQuery"/"DuckDb"/etc)

NarwhalsFrame = from_native(ibis_frame)

.. do narwhals operations

@MarcoGorelli
Member

yup, that's right! we'd just translate to the ibis api, so whatever backend is set there would be used

@rwhitten577
Contributor Author

Hey @MarcoGorelli, I will try to pick this back up in the next few weeks. I haven't looked through all the refactor changes yet, but do you think I'd be better off rebasing this old branch or are the changes significant enough where I'd want to start over and use duckdb or spark as an example?

@MarcoGorelli
Member

awesome, thanks @rwhitten577 !

i think it might be salvageable to continue from here, but still using _spark_like / _duckdb as a reference. looks like most of the merge conflicts are in the tests anyway

@rwhitten577 rwhitten577 force-pushed the feature/initial-ibis-lazyframe branch from b0763ae to 82c41f8 Compare April 14, 2025 16:40
@rwhitten577 rwhitten577 force-pushed the feature/initial-ibis-lazyframe branch from 5b12e12 to 15e89c9 Compare April 14, 2025 21:15
They were being typed as `Any` as they used `ibis.__init__.__getattr__`
Member

@dangotbanned dangotbanned left a comment

Thanks for all your work on this @rwhitten577!

With everything in (#2000 (review)) resolved - I'm approving ❤️

Definitely would need an approval from @MarcoGorelli or @FBruzzesi as well - but I'm happy with where this is at 🎉

@dangotbanned dangotbanned added the ibis Issue is related to ibis backend label May 8, 2025
@MarcoGorelli MarcoGorelli mentioned this pull request May 10, 2025
6 tasks
Member

@MarcoGorelli MarcoGorelli left a comment

Amazing, thanks @rwhitten577 and @dangotbanned for reviewing!

Got some mostly minor comments. I've split out some follow-ups in #2525

The most important comments from my end are:

  • not casting to string for numerical ops
  • not repeatedly calling .drop but instead collecting the items to drop into a list first (in join). In theory I think Ibis should be able to optimise this away, but I don't know if they do, so it's probably safer to just call drop once with everything we need to drop

Once this is addressed, we can ship it
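
The second point can be sketched without Ibis at all. `FakeTable` below is a hypothetical stand-in for a lazy table that simply counts how many drop operations were chained onto it; the column names are invented. It only illustrates the "collect first, drop once" pattern and says nothing about how Ibis actually optimises repeated drops.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class FakeTable:
    """Hypothetical stand-in for a lazy table; tracks chained drop operations."""

    columns: tuple[str, ...]
    drop_calls: int = 0

    def drop(self, *names: str) -> FakeTable:
        remaining = tuple(c for c in self.columns if c not in names)
        return FakeTable(remaining, self.drop_calls + 1)


def drop_one_by_one(table: FakeTable, names: list[str]) -> FakeTable:
    # One drop operation per column: the expression grows with every name.
    for name in names:
        table = table.drop(name)
    return table


def drop_once(table: FakeTable, names: list[str]) -> FakeTable:
    # Collect everything to drop first, then issue a single drop.
    to_drop = [name for name in names if name in table.columns]
    return table.drop(*to_drop) if to_drop else table


joined = FakeTable(("id", "x", "id_right", "key_right"))
repeated = drop_one_by_one(joined, ["id_right", "key_right"])
batched = drop_once(joined, ["id_right", "key_right"])
```

Both helpers end with the same columns, but the batched version adds one operation to the expression instead of one per dropped column.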

    context: _FullContext,
) -> Self:
    def func(df: IbisLazyFrame) -> list[ir.Column]:
        return [df.native[name] for name in evaluate_column_names(df)]
Member

can we just use col(name) here, like we do for spark-like / duckdb?

Contributor Author

We can only do getattr(col, name) where col is ibis._. That leads to test errors, though, because Ibis cannot determine the data type of a deferred value. Accessing the column from the native df is the only way I've found to reference the actual column with type info.

Member

cool, thanks for explaining

aiming for a full review today, hopefully we can ship it

Member

@MarcoGorelli MarcoGorelli left a comment

thanks @rwhitten577, and @dangotbanned for reviewing

happy with the code now, and we've opened issues with Ibis for missing features on their end

let's ship it! 2 approval gifs here as this was an extra-large one

@MarcoGorelli MarcoGorelli merged commit bbb1c65 into narwhals-dev:main May 13, 2025
32 checks passed
@rwhitten577 rwhitten577 deleted the feature/initial-ibis-lazyframe branch May 13, 2025 14:43
@rwhitten577
Contributor Author

Awesome! Thanks @dangotbanned and @MarcoGorelli for reviewing!

Labels

enhancement (New feature or request), ibis (Issue is related to ibis backend)

Development

Successfully merging this pull request may close these issues.

Request for contributions: Ibis support

6 participants