249 changes: 249 additions & 0 deletions _posts/2025-10-30-arrow-rs-57.0.0.md
@@ -0,0 +1,249 @@
---
layout: post
title: "Apache Arrow Rust 57.0.0 Release"
date: "2025-10-30 00:00:00"
author: pmc
categories: [release]
---
<!--
{% comment %}
Licensed to the Apache Software Foundation (ASF) under one or more
contributor license agreements. See the NOTICE file distributed with
this work for additional information regarding copyright ownership.
The ASF licenses this file to you under the Apache License, Version 2.0
(the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
{% endcomment %}
-->

The Apache Arrow team is pleased to announce that the v57.0.0 release of Apache Arrow
Rust is now available on crates.io ([arrow] and [parquet]) and as [source download].

[arrow]: https://crates.io/crates/arrow
[parquet]: https://crates.io/crates/parquet
[source download]: https://dist.apache.org/repos/dist/release/arrow/arrow-rs-57.0.0

See the [57.0.0 changelog] for a full list of changes.

[57.0.0 changelog]: https://github.com/apache/arrow-rs/blob/57.0.0/CHANGELOG.md


## New Features

Note: Arrow Rust hosts the development of the [parquet] crate, a high
performance Rust implementation of [Apache Parquet].

### Performance: 4x Faster Parquet Metadata Parsing 🚀

Ed Seidl ([@etseidl]) and Jörn Horstmann ([@jhorstmann]) contributed a rewritten
Thrift metadata parser for Parquet files that is almost 4x faster than the
previous parser based on the `thrift` crate. This is especially exciting for
low-latency use cases and for reading Parquet files with large amounts of
metadata (e.g. many row groups or columns).
See the [blog post about the new Parquet metadata parser] for more details.

<div style="display: flex; gap: 16px; justify-content: center; align-items: flex-start;">
<img src="{{ site.baseurl }}/img/rust-parquet-metadata/results.png" width="100%" class="img-responsive" alt="" aria-hidden="true">
</div>

*Figure 1:* Performance improvements of [Apache Parquet] metadata parsing between version `56.2.0` and `57.0.0`.
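
Parquet file metadata is serialized with Thrift's compact protocol, which stores integers as zigzag-encoded variable-length integers (varints), so decoding them is one of the innermost steps of any metadata parser. The following is a minimal, standalone sketch of that single step using only the standard library. The function names `read_uvarint`, `zigzag_decode`, and `read_i64` are hypothetical, and this is not the code used in the `parquet` crate.

```rust
/// Decode an unsigned LEB128 varint from the front of `buf`.
/// Returns the value and the number of bytes consumed, or `None` if the
/// input is truncated or longer than the 10 bytes a `u64` can need.
fn read_uvarint(buf: &[u8]) -> Option<(u64, usize)> {
    let mut value: u64 = 0;
    for (i, &byte) in buf.iter().enumerate().take(10) {
        // The low 7 bits carry payload, least significant group first.
        value |= u64::from(byte & 0x7f) << (7 * i);
        // A clear high bit marks the last byte of the varint.
        if byte & 0x80 == 0 {
            return Some((value, i + 1));
        }
    }
    None
}

/// Undo zigzag encoding, which maps 0, -1, 1, -2, ... to 0, 1, 2, 3, ...
fn zigzag_decode(n: u64) -> i64 {
    ((n >> 1) as i64) ^ -((n & 1) as i64)
}

/// Decode a signed integer stored as a zigzag varint.
fn read_i64(buf: &[u8]) -> Option<(i64, usize)> {
    let (raw, consumed) = read_uvarint(buf)?;
    Some((zigzag_decode(raw), consumed))
}
```

For instance, the two bytes `0xAC 0x02` decode to the unsigned value 300, and the single byte `0x03` decodes to the signed value -2.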


[Apache Parquet]: https://parquet.apache.org/
[@etseidl]: https://github.com/etseidl
[@jhorstmann]: https://github.com/jhorstmann

[blog post about the new Parquet metadata parser]: https://arrow.apache.org/blog/2025/10/23/rust-parquet-metadata/

### New `arrow-avro` Crate

The `57.0.0` release introduces a new [`arrow-avro`] crate contributed by [@jecsand838]
and [@nathaniel-d-ef] that provides much more efficient conversion between
[Apache Avro](https://avro.apache.org/) and Arrow `RecordBatch`es, as well as broader feature support.

Previously, Arrow‑based systems that read or wrote Avro data
typically used the general‑purpose [apache-avro] crate. While mature and
feature‑complete, its row-oriented API does not support features such as
projection pushdown or vectorized execution. The new `arrow-avro` crate supports
these features efficiently by converting Avro data directly into Arrow's
columnar format.
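
To make the difference concrete, here is a small conceptual sketch of projection during decoding, using only the standard library. It does not use the `arrow-avro` API: comma-separated text records stand in for encoded Avro data, and the function name `decode_projected` is hypothetical. The point it illustrates is that a columnar decoder can parse only the requested fields, writing them straight into per-field vectors, whereas a row-oriented API has to materialize every field of every record first.

```rust
/// Decode only the projected integer fields of comma-separated records
/// straight into one vector per projected field.
/// Returns `None` if a projected field is missing or is not an integer.
fn decode_projected(records: &[&str], projection: &[usize]) -> Option<Vec<Vec<i64>>> {
    let mut columns = vec![Vec::new(); projection.len()];
    for record in records {
        let fields: Vec<&str> = record.split(',').collect();
        for (column, &field_idx) in columns.iter_mut().zip(projection) {
            // Only projected fields are parsed; the others remain raw text.
            let value: i64 = fields.get(field_idx)?.trim().parse().ok()?;
            column.push(value);
        }
    }
    Some(columns)
}
```

Calling `decode_projected(&["1,10,100", "2,20,200"], &[0, 2])` yields two columns, `[1, 2]` and `[100, 200]`, without ever parsing the middle field.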

See the [blog post about adding arrow-avro] for more details.

<div style="display: flex; gap: 16px; justify-content: center; align-items: flex-start; padding: 20px 15px;">
<img src="{{ site.baseurl }}/img/introducing-arrow-avro/arrow-avro-architecture.svg"
width="100%"
alt="High-level `arrow-avro` architecture"
style="background:#fff">
</div>

*Figure 2:* Architecture of the `arrow-avro` crate.


[@jecsand838]: https://github.com/jecsand838
[@nathaniel-d-ef]: https://github.com/nathaniel-d-ef
[apache-avro]: https://crates.io/crates/apache-avro
[`arrow-avro`]: https://crates.io/crates/arrow-avro

[blog post about adding arrow-avro]: https://arrow.apache.org/blog/2025/10/23/introducing-arrow-avro/


### Parquet Variant Support 🧬

The Apache Parquet project recently added a [new `Variant` type] for
representing semi-structured data. The `57.0.0` release includes support for reading and
writing both normal and shredded `Variant` values to and from Parquet files. It
also includes [parquet-variant], a complete library for working with `Variant`
values; [`VariantArray`], for working with arrays of `Variant` values in Apache
Arrow; and computation kernels for converting to/from JSON and Arrow types,
extracting paths, and shredding values.
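
As a rough illustration of what "shredding" means, the sketch below splits a column of semi-structured values into a strongly typed `typed_value` column and a residual `value` column, with exactly one of the two populated per row. It is a deliberately simplified, standard-library-only model: the `Value` enum, `ShreddedColumn`, and `shred_as_i64` are hypothetical names, and real `Variant` values are binary-encoded rather than a Rust enum.

```rust
/// A toy stand-in for a semi-structured value (not the real `Variant` type).
#[derive(Debug, Clone, PartialEq)]
enum Value {
    Int(i64),
    Text(String),
}

/// The result of shredding one column on the type `i64`.
#[derive(Debug, PartialEq)]
struct ShreddedColumn {
    /// Rows whose value matched the shredded type.
    typed_value: Vec<Option<i64>>,
    /// Rows that did not match and are kept in their original form.
    value: Vec<Option<Value>>,
}

/// Shred a column: integers go to `typed_value`, everything else to `value`.
fn shred_as_i64(values: &[Value]) -> ShreddedColumn {
    let mut typed_value = Vec::with_capacity(values.len());
    let mut value = Vec::with_capacity(values.len());
    for v in values {
        match v {
            Value::Int(i) => {
                typed_value.push(Some(*i));
                value.push(None);
            }
            other => {
                typed_value.push(None);
                value.push(Some(other.clone()));
            }
        }
    }
    ShreddedColumn { typed_value, value }
}
```

The benefit of this layout is that a reader interested only in the integer rows can scan the typed column directly, without decoding each semi-structured value.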

[new `Variant` type]: https://github.com/apache/parquet-format/blob/master/VariantEncoding.md
[`VariantArray`]: https://docs.rs/parquet/latest/parquet/variant/struct.VariantArray.html
[parquet-variant]: https://crates.io/crates/parquet-variant

```rust
// NOTE: the import paths below are assumptions. They rely on the `arrow` crate
// and on the `parquet` crate with its experimental Variant support enabled.
use std::sync::Arc;

use arrow::array::ArrayRef;
use arrow::datatypes::Schema;
use arrow::record_batch::RecordBatch;
use parquet::arrow::ArrowWriter;
use parquet::variant::{VariantArrayBuilder, VariantBuilderExt};

fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Use the VariantArrayBuilder to build a VariantArray
    let mut builder = VariantArrayBuilder::new(3);
    builder.new_object().with_field("name", "Alice").finish(); // row 1: {"name": "Alice"}
    builder.append_value("such wow"); // row 2: "such wow" (a string)
    let array = builder.build();

    // Since VariantArray is an ExtensionType, it needs to be converted
    // to an ArrayRef and Field with the appropriate metadata
    // before it can be written to a Parquet file
    let field = array.field("data");
    let array = ArrayRef::from(array);
    // Create a RecordBatch with the VariantArray
    let schema = Schema::new(vec![field]);
    let batch = RecordBatch::try_new(Arc::new(schema), vec![array])?;

    // Now you can write the RecordBatch to the Parquet file, as normal
    let file = std::fs::File::create("variant.parquet")?;
    let mut writer = ArrowWriter::try_new(file, batch.schema(), None)?;
    writer.write(&batch)?;
    writer.close()?;
    Ok(())
}
```


This support is being integrated into query engines, such as DataFusion (via
[@friendlymatthew]'s [`datafusion-variant`] crate) and [delta-rs]. While this
support is still experimental, we believe the APIs are mostly complete and do
not expect major changes. Please consider trying it out and providing feedback
and improvements.

[`datafusion-variant`]: https://github.com/datafusion-contrib/datafusion-variant
[delta-rs]: https://github.com/delta-io/delta-rs/issues/3637

Thanks to the many contributors who made this possible, including:
* Ryan Johnson ([@scovich]), Congxian Qiu ([@klion26]), and Liam Bao ([@liamzwbao]) for completing the implementation
* Li Jiaying ([@PinkCrow007]), Aditya Bhatnagar ([@carpecodeum]), and Malthe Karbo ([@mkarbo]) for
initiating the work
* Everyone else who has contributed, including [@superserious-dev], [@friendlymatthew], [@micoo227], [@Weijun-H],
[@harshmotw-db], [@odysa], [@viirya], [@adriangb], [@kosiew], [@codephage2020],
[@ding-young], [@mbrobbel], [@petern48], [@sdf-jkl], [@abacef], and [@mprammer].

[@PinkCrow007]: https://github.com/PinkCrow007
[@mkarbo]: https://github.com/mkarbo
[@carpecodeum]: https://github.com/carpecodeum
[@scovich]: https://github.com/scovich
[@superserious-dev]: https://github.com/superserious-dev
[@friendlymatthew]: https://github.com/friendlymatthew
[@micoo227]: https://github.com/micoo227
[@Weijun-H]: https://github.com/Weijun-H
[@harshmotw-db]: https://github.com/harshmotw-db
[@odysa]: https://github.com/odysa
[@viirya]: https://github.com/viirya
[@klion26]: https://github.com/klion26
[@adriangb]: https://github.com/adriangb
[@kosiew]: https://github.com/kosiew
[@liamzwbao]: https://github.com/liamzwbao
[@codephage2020]: https://github.com/codephage2020
[@ding-young]: https://github.com/ding-young
[@mbrobbel]: https://github.com/mbrobbel
[@petern48]: https://github.com/petern48
[@sdf-jkl]: https://github.com/sdf-jkl
[@abacef]: https://github.com/abacef
[@mprammer]: https://github.com/mprammer

See the ticket [Variant type support in Parquet #6736] for more details.


[Variant type support in Parquet #6736]: https://github.com/apache/arrow-rs/issues/6736


### Parquet Geometry Support 🗺️


The `57.0.0` release also includes support for reading and writing the
[Parquet Geometry types] `GEOMETRY` and `GEOGRAPHY`, including
`GeospatialStatistics`. This work was contributed by Kyle Barron
([@kylebarron]), Dewey Dunnington ([@paleolimbot]), Kaushik Srinivasan
([@kaushiksrini]), and Blake Orth ([@BlakeOrth]).

Please see the [Implement Geometry and Geography type support in Parquet] tracking ticket for more details.

[@kylebarron]: https://github.com/kylebarron
[@paleolimbot]: https://github.com/paleolimbot
[@kaushiksrini]: https://github.com/kaushiksrini
[@BlakeOrth]: https://github.com/BlakeOrth

[Parquet Geometry types]: https://github.com/apache/parquet-format/blob/master/Geospatial.md


[Implement Geometry and Geography type support in Parquet]: https://github.com/apache/arrow-rs/issues/8373

## Thanks to Our Contributors
```console
$ git shortlog -sn 56.0.0..57.0.0
36 Matthijs Brobbel
20 Andrew Lamb
13 Ryan Johnson
11 Ed Seidl
10 Connor Sanders
8 Alex Huang
5 Emil Ernerfeldt
5 Liam Bao
5 Matthew Kim
4 nathaniel-d-ef
3 Raz Luvaton
3 albertlockett
3 dependabot[bot]
3 mwish
2 Ben Ye
2 Congxian Qiu
2 Dewey Dunnington
2 Kyle Barron
2 Lilian Maurel
2 Mark Nash
2 Nuno Faria
2 Pepijn Van Eeckhoudt
2 Tobias Schwarzinger
2 lichuang
1 Adam Gutglick
1 Adam Reeve
1 Alex Stephen
1 Chen Chongchen
1 Jack
1 Jeffrey Vo
1 Jörn Horstmann
1 Kaushik Srinivasan
1 Li Jiaying
1 Lin Yihai
1 Marco Neumann
1 Piotr Findeisen
1 Piotr Srebrny
1 Samuele Resca
1 Van De Bio
1 Yan Tingwang
1 ding-young
1 kosiew
1 张林伟
```