Merged
Changes from all commits
32 changes: 25 additions & 7 deletions docs/src/main/sphinx/connector/delta-lake.rst
@@ -104,6 +104,8 @@ values. Typical usage does not require you to configure them.
* ``SNAPPY``
* ``ZSTD``
* ``GZIP``

The equivalent catalog session property is ``compression_codec``.
- ``SNAPPY``
* - ``delta.max-partitions-per-writer``
- Maximum number of partitions per writer.
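
As context beyond this diff, a connector property such as ``delta.compression-codec`` is set in the catalog properties file, while its catalog session equivalent is set per query. A minimal sketch, assuming a catalog named ``example``:

.. code-block:: text

    # etc/catalog/example.properties
    connector.name=delta_lake
    delta.compression-codec=ZSTD

.. code-block:: sql

    -- Override the codec for the current session only
    SET SESSION example.compression_codec = 'ZSTD';
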
@@ -134,15 +136,24 @@ values. Typical usage does not require you to configure them.
* - ``delta.dynamic-filtering.wait-timeout``
- Duration to wait for completion of :doc:`dynamic filtering
</admin/dynamic-filtering>` during split generation.
The equivalent catalog session property is
``dynamic_filtering_wait_timeout``.
-
* - ``delta.table-statistics-enabled``
- Enables :ref:`Table statistics <delta-lake-table-statistics>` for
performance improvements. The equivalent catalog session property
is ``statistics_enabled``.
- ``true``
* - ``delta.extended-statistics.enabled``
- Enable statistics collection with :doc:`/sql/analyze` and
use of extended statistics. The equivalent catalog session property
is ``extended_statistics_enabled``.
- ``true``
* - ``delta.extended-statistics.collect-on-write``
- Enable collection of extended statistics for write operations.
The equivalent catalog session property is
``extended_statistics_collect_on_write``.
- ``true``
* - ``delta.per-transaction-metastore-cache-maximum-size``
- Maximum number of metastore data objects per transaction in
the Hive metastore cache.
@@ -156,6 +167,7 @@ values. Typical usage does not require you to configure them.
- JVM default
* - ``delta.target-max-file-size``
- Target maximum size of written files; the actual size could be larger.
The equivalent catalog session property is ``target_max_file_size``.
- ``1GB``
* - ``delta.unique-table-location``
- Use randomized, unique table locations.
@@ -166,16 +178,17 @@ values. Typical usage does not require you to configure them.
* - ``delta.vacuum.min-retention``
- Minimum retention threshold for the files taken into account
for removal by the :ref:`VACUUM<delta-lake-vacuum>` procedure.
The equivalent catalog session property is
``vacuum_min_retention``.
- ``7 DAYS``
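
The ``VACUUM`` procedure referenced above is invoked per table, and its retention argument must not be lower than the ``delta.vacuum.min-retention`` threshold. A hedged sketch, assuming a catalog named ``example`` with a table ``example.default.orders``:

.. code-block:: sql

    -- Remove old data files no longer referenced by the transaction log;
    -- 8d satisfies the default 7 DAYS minimum retention
    CALL example.system.vacuum('default', 'orders', '8d');
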

Catalog session properties
^^^^^^^^^^^^^^^^^^^^^^^^^^

The following table describes :ref:`catalog session properties
<session-properties-definition>` supported by the Delta Lake connector:

.. list-table:: Catalog session properties
:widths: 40, 60, 20
:header-rows: 1

@@ -1091,7 +1104,8 @@ connector.
with highly skewed aggregations or joins.
- ``0.05``
* - ``parquet.max-read-block-row-count``
- Sets the maximum number of rows read in a batch. The equivalent catalog
session property is ``parquet_max_read_block_row_count``.
- ``8192``
* - ``parquet.optimized-reader.enabled``
- Specifies whether batched column readers are used when reading Parquet
@@ -1106,6 +1120,10 @@ connector.
for structural data types. The equivalent catalog session property is
``parquet_optimized_nested_reader_enabled``.
- ``true``
* - ``parquet.use-column-index``
Review comment (Contributor Author): Relevant commit 6eb42f2, cc @raunaqmorarka
- Skip reading Parquet pages by using Parquet column indices. The equivalent
catalog session property is ``parquet_use_column_index``.
- ``true``
* - ``delta.projection-pushdown-enabled``
- Read only projected fields from row columns while performing ``SELECT`` queries.
- ``true``
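
As a usage sketch beyond this diff, the catalog session properties named in these rows can be inspected and toggled per session. The catalog name ``example`` is assumed:

.. code-block:: sql

    -- List the current values of this catalog's session properties
    SHOW SESSION LIKE 'example%';

    -- Disable use of the Parquet column index for this session
    SET SESSION example.parquet_use_column_index = false;
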