Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use arrow::MakeBuilder to create arrow array builder that exactly match the desired datatype #4280

Merged
merged 4 commits into from
Nov 21, 2023

Conversation

Wumpf
Copy link
Member

@Wumpf Wumpf commented Nov 20, 2023

What

Previously, not knowing about arrow::MakeBuilder, we created the hierarchy of arrow array builders manually in a utility method dubbed new_arrow_array_builder. While working on C FFI it turned out that this way we end up with an arrow array whose data type doesn't have the correct nullability settings - in other words, on creating the arrow array builders we would have needed to pass the exact data types. This is what I tried at first and it worked, but then I noticed that there's already arrow::MakeBuilder which internally uses a bunch of rolled out switch/case statements to create the hierarchy programmatically.

It turns out that this in combination with the work done on switching from IPC to C FFI gives another significant speed-up. When tested in isolation however (i.e. still on IPC), the gains seem to be rather modest and within the (large!) measurement noise.

Checklist

  • I have read and agree to Contributor Guide and the Code of Conduct
  • I've included a screenshot or gif (if applicable)
  • I have tested demo.rerun.io (if applicable)
  • The PR title and labels are set such as to maximize their usefulness for the next release's CHANGELOG

@Wumpf Wumpf added 🚜 refactor Change the code, not the functionality 🌊 C++ API C/C++ API specific exclude from changelog PRs with this won't show up in CHANGELOG.md labels Nov 20, 2023
Copy link
Member

@teh-cmc teh-cmc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nice

@Wumpf Wumpf merged commit 16b4c3c into main Nov 21, 2023
42 checks passed
@Wumpf Wumpf deleted the andreas/cpp/fix-arrow-array-datatypes branch November 21, 2023 09:06
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
🌊 C++ API C/C++ API specific exclude from changelog PRs with this won't show up in CHANGELOG.md 🚜 refactor Change the code, not the functionality
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants