From b1587c1149c5a94e8acfa3c0d8d623be4fd2f37c Mon Sep 17 00:00:00 2001
From: Andrew Lamb
Date: Tue, 3 Oct 2023 06:01:09 -0400
Subject: [PATCH] Document crate feature flags (#7713)

* Document crate feature flags

* prettier
---
 README.md                               | 28 +++++++++++++++++++++++--
 datafusion/common/src/pyarrow.rs        |  2 +-
 docs/source/user-guide/example-usage.md |  4 ----
 3 files changed, 27 insertions(+), 7 deletions(-)

diff --git a/README.md b/README.md
index ccb527a1f977..63da8c1c1a96 100644
--- a/README.md
+++ b/README.md
@@ -35,9 +35,33 @@ Here are links to some important information
 - [Python DataFrame API](https://arrow.apache.org/datafusion-python/)
 - [Architecture](https://docs.rs/datafusion/latest/datafusion/index.html#architecture)
 
-## Building your project with DataFusion
+## What can you do with this crate?
 
-DataFusion is great for building projects and products like SQL interfaces, time series platforms, and domain specific query engines. [Click Here](https://arrow.apache.org/datafusion/user-guide/introduction.html#known-users) to see a list known users.
+DataFusion is great for building projects such as domain specific query engines, new database platforms and data pipelines, query languages and more.
+It lets you start quickly from a fully working engine, and then customize those features specific to your use. [Click Here](https://arrow.apache.org/datafusion/user-guide/introduction.html#known-users) to see a list known users.
+
+## Crate features
+
+Default features:
+
+- `compression`: reading files compressed with `xz2`, `bzip2`, `flate2`, and `zstd`
+- `crypto_expressions`: cryptographic functions such as `md5` and `sha256`
+- `encoding_expressions`: `encode` and `decode` functions
+- `regex_expressions`: regular expression functions, such as `regexp_match`
+- `unicode_expressions`: Include unicode aware functions such as `character_length`
+
+Optional features:
+
+- `avro`: support for reading the [Apache Avro] format
+- `backtrace`: include backtrace information in error messages
+- `pyarrow`: conversions between PyArrow and DataFusion types
+- `simd`: enable arrow-rs's manual `SIMD` kernels (requires Rust `nightly`)
+
+[apache avro]: https://avro.apache.org/
+
+## Rust Version Compatibility
+
+This crate is tested with the latest stable version of Rust. We do not currently test against other, older versions of the Rust compiler.
 
 ## Contributing to DataFusion
 
diff --git a/datafusion/common/src/pyarrow.rs b/datafusion/common/src/pyarrow.rs
index d18782e037ae..d78aa8b988f7 100644
--- a/datafusion/common/src/pyarrow.rs
+++ b/datafusion/common/src/pyarrow.rs
@@ -15,7 +15,7 @@
 // specific language governing permissions and limitations
 // under the License.
 
-//! PyArrow
+//! Conversions between PyArrow and DataFusion types
 
 use arrow::array::ArrayData;
 use arrow::pyarrow::{FromPyArrow, ToPyArrow};
diff --git a/docs/source/user-guide/example-usage.md b/docs/source/user-guide/example-usage.md
index adaf780558bc..c631d552dd73 100644
--- a/docs/source/user-guide/example-usage.md
+++ b/docs/source/user-guide/example-usage.md
@@ -187,10 +187,6 @@ DataFusion is designed to be extensible at all points. To that end, you can prov
 - [x] User Defined `LogicalPlan` nodes
 - [x] User Defined `ExecutionPlan` nodes
 
-## Rust Version Compatibility
-
-This crate is tested with the latest stable version of Rust. We do not currently test against other, older versions of the Rust compiler.
-
 ## Optimized Configuration
 
 For an optimized build several steps are required. First, use the below in your `Cargo.toml`. It is