Leverage dictionary-encode when turning a scalar columnar value into an array #11503

Open
doki23 opened this issue Jul 17, 2024 · 3 comments
Labels
enhancement New feature or request

Comments

doki23 commented Jul 17, 2024

Is your feature request related to a problem or challenge?

ColumnarValue has an into_array function that converts it into an Arrow array, like this:

pub fn into_array(self, num_rows: usize) -> Result<ArrayRef> {
    Ok(match self {
        ColumnarValue::Array(array) => array,
        ColumnarValue::Scalar(scalar) => scalar.to_array_of_size(num_rows)?,
    })
}

If the column is of type Int32, this returns an Int32Array in which the scalar value is repeated num_rows times.
Because a ColumnarValue::Scalar has a cardinality of 1, in some special cases we could instead turn it into a dictionary array, which stores the value only once, to obtain some performance gains.

Describe the solution you'd like

For example, to turn a scalar value of type f64 into an array of size 256, we could return a dictionary array with f64 values and uint8 keys.
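To make the size difference concrete, here is a minimal sketch in plain Rust. It is a toy model, not the arrow-rs or DataFusion API: F64Column, scalar_to_plain, scalar_to_dictionary, and data_bytes are hypothetical names introduced only for illustration. In arrow-rs the real counterpart would presumably be a DictionaryArray keyed by UInt8Type.

```rust
use std::mem::size_of;

/// Toy stand-in for an Arrow array of f64 values (hypothetical, not arrow-rs).
#[derive(Debug, Clone, PartialEq)]
pub enum F64Column {
    /// One 8-byte slot per row.
    Plain(Vec<f64>),
    /// One u8 key per row; each key is an index into `values`.
    Dictionary { keys: Vec<u8>, values: Vec<f64> },
}

impl F64Column {
    /// Number of logical rows, regardless of encoding.
    pub fn len(&self) -> usize {
        match self {
            F64Column::Plain(v) => v.len(),
            F64Column::Dictionary { keys, .. } => keys.len(),
        }
    }

    /// Logical value at `row`, regardless of encoding.
    pub fn value(&self, row: usize) -> f64 {
        match self {
            F64Column::Plain(v) => v[row],
            F64Column::Dictionary { keys, values } => values[keys[row] as usize],
        }
    }

    /// Bytes occupied by the data buffers (ignores allocator and metadata overhead).
    pub fn data_bytes(&self) -> usize {
        match self {
            F64Column::Plain(v) => v.len() * size_of::<f64>(),
            F64Column::Dictionary { keys, values } => {
                keys.len() * size_of::<u8>() + values.len() * size_of::<f64>()
            }
        }
    }
}

/// Analogue of the current behavior: repeat the scalar `num_rows` times.
pub fn scalar_to_plain(value: f64, num_rows: usize) -> F64Column {
    F64Column::Plain(vec![value; num_rows])
}

/// Analogue of the proposal: store the scalar once and point every row at key 0.
pub fn scalar_to_dictionary(value: f64, num_rows: usize) -> F64Column {
    F64Column::Dictionary {
        keys: vec![0u8; num_rows],
        values: vec![value],
    }
}
```

For 256 rows, the plain form holds 256 × 8 = 2048 bytes of data, while the dictionary form holds 256 × 1 + 8 = 264 bytes, and both yield the same logical value at every row. Whether that saving translates into faster execution depends on how well downstream kernels handle dictionary inputs, which this sketch does not model.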

Describe alternatives you've considered

No response

Additional context

No response

@doki23 doki23 added the enhancement New feature or request label Jul 17, 2024

alamb commented Jul 18, 2024

I agree this is a good idea

It may be easier to take advantage of after @notfilippo's proposal for logical types: #11513

doki23 commented Jul 19, 2024

> It may be easier to take advantage of after @notfilippo's proposal for logical types #11513

Yes, it also feels a little odd to me that Dictionary is a data type in Arrow. It's really more of an encoding/layout than a data type, right?

alamb commented Jul 19, 2024

Update: there is now a more fully fleshed-out proposal for logical types: #11513
