diff --git a/docs/src/main/sphinx/connector/hive.rst b/docs/src/main/sphinx/connector/hive.rst
index b020ac8f9d3c..00061118a749 100644
--- a/docs/src/main/sphinx/connector/hive.rst
+++ b/docs/src/main/sphinx/connector/hive.rst
@@ -865,6 +865,8 @@ as Hive. For example, converting the string ``'foo'`` to a number,
 or converting the string ``'1234'`` to a ``tinyint`` (which has a
 maximum value of ``127``).
 
+.. _hive_avro_schema:
+
 Avro schema evolution
 ---------------------
 
@@ -984,6 +986,109 @@ Procedures
   Flush Hive metadata cache entries connected with selected partition.
   Procedure requires named parameters to be passed
 
+.. _hive_table_properties:
+
+Table properties
+----------------
+
+Table properties supply or set metadata for the underlying tables. This
+is key for :doc:`/sql/create-table-as` statements. Table properties are passed
+to the connector using a :doc:`WITH ` clause::
+
+  CREATE TABLE tablename
+  WITH (format = 'CSV',
+        csv_escape = '"')
+
+See the :ref:`hive_examples` for more information.
+
+.. list-table:: Hive connector table properties
+  :widths: 20, 60, 20
+  :header-rows: 1
+
+  * - Property name
+    - Description
+    - Default
+  * - ``auto_purge``
+    - Indicates to the configured metastore to perform a purge when a table or
+      partition is deleted instead of a soft deletion using the trash.
+    -
+  * - ``avro_schema_url``
+    - The URI pointing to :ref:`hive_avro_schema` for the table.
+    -
+  * - ``bucket_count``
+    - The number of buckets to group data into. Only valid if used with
+      ``bucketed_by``.
+    - 0
+  * - ``bucketed_by``
+    - The bucketing column for the storage table. Only valid if used with
+      ``bucket_count``.
+    - ``[]``
+  * - ``bucketing_version``
+    - Specifies which Hive bucketing version to use. Valid values are ``1``
+      or ``2``.
+    -
+  * - ``csv_escape``
+    - The CSV escape character. Requires CSV format.
+    -
+  * - ``csv_quote``
+    - The CSV quote character. Requires CSV format.
+    -
+  * - ``csv_separator``
+    - The CSV separator character. Requires CSV format.
+    -
+  * - ``external_location``
+    - The URI for an external Hive table on S3, Azure Blob Storage, etc. See the
+      :ref:`hive_examples` for more information.
+    -
+  * - ``format``
+    - The table file format. Valid values include ``ORC``, ``PARQUET``, ``AVRO``,
+      ``RCBINARY``, ``RCTEXT``, ``SEQUENCEFILE``, ``JSON``, ``TEXTFILE``, and
+      ``CSV``. The catalog property ``hive.storage-format`` sets the default
+      format used when this property is omitted.
+    -
+  * - ``null_format``
+    - The serialization format for ``NULL`` values. Requires TextFile, RCText,
+      or SequenceFile format.
+    -
+  * - ``orc_bloom_filter_columns``
+    - Comma-separated list of columns to use for ORC bloom filters. It improves
+      the performance of queries using equality and ``IN`` predicates when
+      reading ORC files. Requires ORC format.
+    - ``[]``
+  * - ``orc_bloom_filter_fpp``
+    - The ORC bloom filter false positive probability. Requires ORC format.
+    - 0.05
+  * - ``partitioned_by``
+    - The partitioning column for the storage table. The columns listed in the
+      ``partitioned_by`` clause must be the last columns as defined in the DDL.
+    - ``[]``
+  * - ``skip_footer_line_count``
+    - The number of footer lines to ignore when parsing the file for data.
+      Requires TextFile or CSV format tables.
+    -
+  * - ``skip_header_line_count``
+    - The number of header lines to ignore when parsing the file for data.
+      Requires TextFile or CSV format tables.
+    -
+  * - ``sorted_by``
+    - The column to sort rows by within each bucket. Only valid if
+      ``bucketed_by`` and ``bucket_count`` are specified as well.
+    - ``[]``
+  * - ``textfile_field_separator``
+    - Allows the use of custom field separators, such as ``|``, for TextFile
+      formatted tables.
+    -
+  * - ``textfile_field_separator_escape``
+    - Allows the use of a custom escape character for TextFile formatted tables.
+    -
+  * - ``transactional``
+    - Set this property to ``true`` to create an ORC ACID transactional table.
+      Requires ORC format. This property may be shown as ``true`` for insert-only
+      tables created using older versions of Hive.
+    -
+
+.. _hive_special_columns:
+
 Special columns
 ---------------
 
@@ -1014,11 +1119,10 @@ Retrieve all records that belong to files stored in the partition
     FROM hive.web.page_views
     WHERE "$partition" = 'ds=2016-08-09/country=US'
 
-Special tables
-----------------
+.. _hive_special_tables:
 
-Table properties
-^^^^^^^^^^^^^^^^
+Special tables
+--------------
 
 The raw Hive table properties are available as a hidden table, containing
 a separate column per table property, with a single row containing the property
@@ -1029,6 +1133,8 @@ You can inspect the property names and values with a simple query::
 
     SELECT * FROM hive.web."page_views$properties";
 
+.. _hive_examples:
+
 Examples
 --------