Maps cast to other Maps with different Elements, Key and Value Names #5702

Closed
HawaiianSpork opened this issue Apr 29, 2024 · 1 comment · Fixed by #5703
Labels
arrow: Changes to the arrow crate
enhancement: Any new improvement worthy of an entry in the changelog

Comments

@HawaiianSpork
Contributor

Is your feature request related to a problem or challenge? Please describe what you are trying to do.
Arrow has field names for list elements and map elements. arrow-rs can cast a list whose element field has one name (such as `elements`) to a list whose element field has another name (such as `items`). This is not supported for the map type, which has three field names: one for the elements, one for the keys, and one for the values. This can be a limitation for anyone using arrays of maps from different sources.

Describe the solution you'd like
Maps should mimic the behavior of lists: casting would be supported when the elements of the two map types have the same structure, even if the field names for the elements, keys, and values differ. According to the Arrow spec, the key is always the first field and the value is always the second field.
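The rule proposed above can be sketched with a small toy model. This is not the arrow-rs API: `ToyType`, `ToyField`, `field`, and `same_shape` are hypothetical names invented for illustration, and only a few types are modelled. The sketch shows the intended comparison, in which key and value fields are matched by position and their names (and the name of the elements field) are ignored.

```rust
// Toy model only: these types are hypothetical and are NOT arrow-rs types.
#[derive(Debug, Clone, PartialEq)]
enum ToyType {
    Int32,
    Utf8,
    // A list with a single named element field.
    List(Box<ToyField>),
    // A map with a named elements field holding a key field (first)
    // and a value field (second).
    Map {
        elements: String,
        key: Box<ToyField>,
        value: Box<ToyField>,
    },
}

#[derive(Debug, Clone, PartialEq)]
struct ToyField {
    name: String,
    data_type: ToyType,
}

// Convenience constructor for a named field.
fn field(name: &str, data_type: ToyType) -> ToyField {
    ToyField {
        name: name.to_string(),
        data_type,
    }
}

// Returns true when the two types have the same structure,
// ignoring the names of nested fields.
fn same_shape(from: &ToyType, to: &ToyType) -> bool {
    match (from, to) {
        (ToyType::Int32, ToyType::Int32) | (ToyType::Utf8, ToyType::Utf8) => true,
        // Lists: compare the element types, not the element names.
        (ToyType::List(a), ToyType::List(b)) => same_shape(&a.data_type, &b.data_type),
        // Maps: compare keys with keys and values with values by position.
        (
            ToyType::Map { key: k1, value: v1, .. },
            ToyType::Map { key: k2, value: v2, .. },
        ) => same_shape(&k1.data_type, &k2.data_type) && same_shape(&v1.data_type, &v2.data_type),
        _ => false,
    }
}
```

Under this model, a map whose fields are named `entries`/`key`/`value` and one whose fields are named `key_value`/`keys`/`values` compare as the same shape as long as the key and value types match. A real cast could also allow the key and value types themselves to be cast, which this sketch does not attempt.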

Describe alternatives you've considered
Some alternatives would be:

  1. Rather than using the field names encoded in files such as Parquet for lists and maps, always convert to the same field names when reading into an Arrow record batch. A small amount of information would be lost with this approach, since the field names used before writing to Parquet would not be preserved; it is not clear how much this matters in practice. The proposed solution also does not prevent this approach from being implemented in the future.
  2. Require third parties outside of Arrow core to do the casting. This lets consumers of arrow-rs control which kinds of casting are supported based on their needs. In most cases, though, consumers of arrow-rs will want the same kind of casting.
@HawaiianSpork added the enhancement label Apr 29, 2024
@alamb added the arrow label Jul 2, 2024
@alamb
Contributor

alamb commented Jul 2, 2024

label_issue.py automatically added labels {'arrow'} from #5703
