Possible to factor out the null bytemap and/or use more of arrow compute API? #191

davesque · 2020-10-07T21:46:40Z

I noticed that fletcher converts the null bitmap into a null bytemap as a step in many computations for arrays that have null values. Do you have any interest in eventually factoring this step out, or in accepting PRs that do? I think that would involve a fair bit of custom Cython or Numba code that manually iterates over the null bitmap along with the values buffer. But it might be worth doing, and it could narrow the gap with pandas, or even overtake it, on some of the benchmarks in your benchmarking suite.

Also, I noticed a number of other places where it might be possible to make simple calls to the Arrow compute API. As an experiment, I modified the FletcherBaseArray.sum method to call pyarrow.compute.sum directly. The downside is that you can no longer control null handling via skipna. However, it speeds things up by a lot (35-40% faster than pandas or fletcher). It makes me wonder whether it would be worth implementing more of fletcher's internals via Cython and Arrow's compute API.

What are your thoughts on these things?

xhochy · 2023-02-22T15:15:16Z

This project has been archived, as development ceased around 2021.
With the support of Apache Arrow-backed extension arrays in pandas, the major goal of this project has been fulfilled.

xhochy closed this as completed Feb 22, 2023
