diff --git a/docs/src/main/sphinx/connector/delta-lake.rst b/docs/src/main/sphinx/connector/delta-lake.rst
index 76aec22883eb..cff3669eeae6 100644
--- a/docs/src/main/sphinx/connector/delta-lake.rst
+++ b/docs/src/main/sphinx/connector/delta-lake.rst
@@ -35,8 +35,8 @@ The connector recognizes Delta tables created in the metastore by the Databricks
 runtime. If non-Delta tables are present in the metastore, as well, they are
 not visible to the connector.
 
-To configure the Delta Lake connector, create a catalog properties file, for
-example ``etc/catalog/delta.properties``, that references the ``delta-lake``
+To configure the Delta Lake connector, create a catalog properties file
+``etc/catalog/example.properties`` that references the ``delta-lake``
 connector. Update the ``hive.metastore.uri`` with the URI of your Hive
 metastore Thrift service:
@@ -501,14 +501,14 @@ You can create a schema with the :doc:`/sql/create-schema` statement and the
 subdirectory under the schema location. Data files for tables in this schema
 using the default location are cleaned up if the table is dropped::
 
-    CREATE SCHEMA delta.my_schema
+    CREATE SCHEMA example.example_schema
     WITH (location = 's3://my-bucket/a/path');
 
 Optionally, the location can be omitted. Tables in this schema must have a
 location included when you create them. The data files for these tables are not
 removed if the table is dropped::
 
-    CREATE SCHEMA delta.my_schema;
+    CREATE SCHEMA example.example_schema;
 
 .. _delta-lake-create-table:
 
@@ -518,7 +518,7 @@ Creating tables
 
 When Delta tables exist in storage, but not in the metastore, Trino can be
 used to register them::
 
-    CREATE TABLE delta.default.my_table (
+    CREATE TABLE example.default.example_table (
       dummy bigint
     )
     WITH (
@@ -541,7 +541,7 @@ If the specified location does not already contain a Delta table, the connector
 automatically writes the initial transaction log entries and registers the
 table in the metastore.
 As a result, any Databricks engine can write to the table::
 
-    CREATE TABLE delta.default.new_table (id bigint, address varchar);
+    CREATE TABLE example.default.new_table (id bigint, address varchar);
 
 The Delta Lake connector also supports creating tables using the
 :doc:`CREATE TABLE AS ` syntax.
@@ -563,7 +563,7 @@ There are three table properties available for use in table creation.
 
 The following example uses all three table properties::
 
-    CREATE TABLE delta.default.my_partitioned_table
+    CREATE TABLE example.default.example_partitioned_table
     WITH (
       location = 's3://my-bucket/a/path',
       partitioned_by = ARRAY['regionkey'],
@@ -581,7 +581,7 @@ The connector can register table into the metastore with existing transaction lo
 The ``system.register_table`` procedure allows the caller to register an
 existing delta lake table in the metastore, using its existing transaction
 logs and data files::
 
-    CALL delta.system.register_table(schema_name => 'testdb', table_name => 'customer_orders', table_location => 's3://my-bucket/a/path')
+    CALL example.system.register_table(schema_name => 'testdb', table_name => 'customer_orders', table_location => 's3://my-bucket/a/path')
 
 To prevent unauthorized users from accessing data, this procedure is disabled
 by default. The procedure is enabled only when
 ``delta.register-table-procedure.enabled`` is set to ``true``.
@@ -658,7 +658,7 @@ limit the amount of data used to generate the table statistics:
 
 .. code-block:: SQL
 
-    ANALYZE my_table WITH(files_modified_after = TIMESTAMP '2021-08-23
+    ANALYZE example_table WITH(files_modified_after = TIMESTAMP '2021-08-23
         16:43:01.321 Z')
 
 As a result, only files newer than the specified time stamp are used in the
@@ -669,7 +669,7 @@ property:
 
 .. code-block:: SQL
 
-    ANALYZE my_table WITH(columns = ARRAY['nationkey', 'regionkey'])
+    ANALYZE example_table WITH(columns = ARRAY['nationkey', 'regionkey'])
 
 To run ``ANALYZE`` with ``columns`` more than once, the next ``ANALYZE`` must
 run on the same set or a subset of the original columns used.
@@ -693,7 +693,7 @@ extended statistics for a specified table in a specified schema:
 
 .. code-block::
 
-    CALL delta_catalog.system.drop_extended_stats('my_schema', 'my_table')
+    CALL example.system.drop_extended_stats('example_schema', 'example_table')
 
 Memory usage
@@ -721,7 +721,7 @@ as follows:
 
 .. code-block:: shell
 
-    CALL mydeltacatalog.system.vacuum('myschemaname', 'mytablename', '7d');
+    CALL example.system.vacuum('exampleschemaname', 'exampletablename', '7d');
 
 All parameters are required, and must be presented in the following order:
diff --git a/docs/src/main/sphinx/connector/druid.rst b/docs/src/main/sphinx/connector/druid.rst
index 698e58491d67..11d004564a4e 100644
--- a/docs/src/main/sphinx/connector/druid.rst
+++ b/docs/src/main/sphinx/connector/druid.rst
@@ -25,8 +25,8 @@ Create a catalog properties file that specifies the Druid connector by setting
 the ``connector.name`` to ``druid`` and configuring the ``connection-url``
 with the JDBC string to connect to Druid.
 
-For example, to access a database as ``druid``, create the file
-``etc/catalog/druid.properties``. Replace ``BROKER:8082`` with the correct
+For example, to access a database as ``example``, create the file
+``etc/catalog/example.properties``. Replace ``BROKER:8082`` with the correct
 host and port of your Druid broker.
 
 .. code-block:: properties
@@ -43,7 +43,7 @@ secured by basic authentication by updating the URL and adding credentials:
 
     connection-user=root
    connection-password=secret
 
-Now you can access your Druid database in Trino with the ``druiddb`` catalog
+Now you can access your Druid database in Trino with the ``example`` catalog
 name from the properties file.
 The ``connection-user`` and ``connection-password`` are typically required and
@@ -144,7 +144,7 @@ to split and then count the number of comma-separated values in a column::
             num_reports
         FROM
           TABLE(
-            druid.system.query(
+            example.system.query(
               query => 'SELECT
                 MV_LENGTH(
                   STRING_TO_MV(direct_reports, ",")
diff --git a/docs/src/main/sphinx/connector/elasticsearch.rst b/docs/src/main/sphinx/connector/elasticsearch.rst
index 29494bd5f4ae..d5760ae6c50e 100644
--- a/docs/src/main/sphinx/connector/elasticsearch.rst
+++ b/docs/src/main/sphinx/connector/elasticsearch.rst
@@ -17,8 +17,8 @@ Configuration
 -------------
 
 To configure the Elasticsearch connector, create a catalog properties file
-``etc/catalog/elasticsearch.properties`` with the following contents,
-replacing the properties as appropriate:
+``etc/catalog/example.properties`` with the following contents, replacing the
+properties as appropriate for your setup:
 
 .. code-block:: text
@@ -462,7 +462,7 @@ documents in the ``orders`` index where the country name is ``ALGERIA``::
         *
     FROM
         TABLE(
-            elasticsearch.system.raw_query(
+            example.system.raw_query(
                 schema => 'sales',
                 index => 'orders',
                 query => '{
diff --git a/docs/src/main/sphinx/connector/hudi.rst b/docs/src/main/sphinx/connector/hudi.rst
index 213f6f8b6fb4..4acfc00c6560 100644
--- a/docs/src/main/sphinx/connector/hudi.rst
+++ b/docs/src/main/sphinx/connector/hudi.rst
@@ -26,10 +26,10 @@ metastore configuration properties as the :doc:`Hive connector
 The connector recognizes Hudi tables synced to the metastore by the
 `Hudi sync tool `_.
 
-To create a catalog that uses the Hudi connector, create a catalog properties file,
-for example ``etc/catalog/example.properties``, that references the ``hudi``
-connector. Update the ``hive.metastore.uri`` with the URI of your Hive metastore
-Thrift service:
+To create a catalog that uses the Hudi connector, create a catalog properties
+file ``etc/catalog/example.properties`` that references the ``hudi`` connector.
+Update the ``hive.metastore.uri`` with the URI of your Hive metastore Thrift
+service:
 
 .. code-block:: properties
@@ -119,7 +119,7 @@ Here are some sample queries:
 
 .. code-block:: sql
 
-    USE a-catalog.myschema;
+    USE example.example_schema;
 
     SELECT symbol, max(ts)
     FROM stock_ticks_cow
diff --git a/docs/src/main/sphinx/connector/iceberg.rst b/docs/src/main/sphinx/connector/iceberg.rst
index befe97b6f49e..015b3b57f3fe 100644
--- a/docs/src/main/sphinx/connector/iceberg.rst
+++ b/docs/src/main/sphinx/connector/iceberg.rst
@@ -288,22 +288,22 @@ subdirectory under the directory corresponding to the schema location.
 
 Create a schema on S3::
 
-    CREATE SCHEMA iceberg.my_s3_schema
+    CREATE SCHEMA example.example_s3_schema
     WITH (location = 's3://my-bucket/a/path/');
 
 Create a schema on a S3 compatible object storage such as MinIO::
 
-    CREATE SCHEMA iceberg.my_s3a_schema
+    CREATE SCHEMA example.example_s3a_schema
     WITH (location = 's3a://my-bucket/a/path/');
 
 Create a schema on HDFS::
 
-    CREATE SCHEMA iceberg.my_hdfs_schema
+    CREATE SCHEMA example.example_hdfs_schema
     WITH (location='hdfs://hadoop-master:9000/user/hive/warehouse/a/path/');
 
 Optionally, on HDFS, the location can be omitted::
 
-    CREATE SCHEMA iceberg.my_hdfs_schema;
+    CREATE SCHEMA example.example_hdfs_schema;
 
 .. _iceberg-create-table:
 
@@ -314,7 +314,7 @@ The Iceberg connector supports creating tables using the :doc:`CREATE TABLE
 ` syntax. Optionally specify the :ref:`table properties
 ` supported by this connector::
 
-    CREATE TABLE my_table (
+    CREATE TABLE example_table (
         c1 integer,
         c2 date,
         c3 double
@@ -363,7 +363,7 @@ The Iceberg connector supports setting ``NOT NULL`` constraints on the table col
 The ``NOT NULL`` constraint can be set on the columns, while creating tables by
 using the :doc:`CREATE TABLE ` syntax::
 
-    CREATE TABLE my_table (
+    CREATE TABLE example_table (
         year INTEGER NOT NULL,
         name VARCHAR NOT NULL,
         age INTEGER,
@@ -542,7 +542,7 @@ partitioning columns, that can match entire partitions. Given the table definiti
 from :ref:`Partitioned Tables ` section, the following SQL statement deletes
 all partitions for which ``country`` is ``US``::
 
-    DELETE FROM iceberg.testdb.customer_orders
+    DELETE FROM example.testdb.customer_orders
     WHERE country = 'US'
 
 A partition delete is performed if the ``WHERE`` clause meets these conditions.
@@ -704,7 +704,7 @@ Transform Description
 
 In this example, the table is partitioned by the month of ``order_date``, a
 hash of ``account_number`` (with 10 buckets), and ``country``::
 
-    CREATE TABLE iceberg.testdb.customer_orders (
+    CREATE TABLE example.testdb.customer_orders (
         order_id BIGINT,
         order_date DATE,
         account_number BIGINT,
@@ -724,7 +724,7 @@ For example, you could find the snapshot IDs for the ``customer_orders`` table
 by running the following query::
 
     SELECT snapshot_id
-    FROM iceberg.testdb."customer_orders$snapshots"
+    FROM example.testdb."customer_orders$snapshots"
     ORDER BY committed_at DESC
 
 Time travel queries
@@ -739,7 +739,7 @@ snapshot identifier corresponding to the version of the table that needs to be
 retrieved::
 
     SELECT *
-    FROM iceberg.testdb.customer_orders FOR VERSION AS OF 8954597067493422955
+    FROM example.testdb.customer_orders FOR VERSION AS OF 8954597067493422955
 
 A different approach of retrieving historical data is to specify a point in
 time in the past, such as a day or week ago.
 The latest snapshot
@@ -747,7 +747,7 @@ of the table taken before or at the specified timestamp in the query is
 internally used for providing the previous state of the table::
 
     SELECT *
-    FROM iceberg.testdb.customer_orders FOR TIMESTAMP AS OF TIMESTAMP '2022-03-23 09:59:29.803 Europe/Vienna'
+    FROM example.testdb.customer_orders FOR TIMESTAMP AS OF TIMESTAMP '2022-03-23 09:59:29.803 Europe/Vienna'
 
 Rolling back to a previous snapshot
 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
@@ -755,13 +755,13 @@ Use the ``$snapshots`` metadata table to determine the latest snapshot ID of
 the table like in the following query::
 
     SELECT snapshot_id
-    FROM iceberg.testdb."customer_orders$snapshots"
+    FROM example.testdb."customer_orders$snapshots"
     ORDER BY committed_at DESC
     LIMIT 1
 
 The procedure ``system.rollback_to_snapshot`` allows the caller to roll back
 the state of the table to a previous snapshot id::
 
-    CALL iceberg.system.rollback_to_snapshot('testdb', 'customer_orders', 8954597067493422955)
+    CALL example.system.rollback_to_snapshot('testdb', 'customer_orders', 8954597067493422955)
 
 Schema evolution
 ----------------
@@ -781,14 +781,14 @@ The procedure ``system.register_table`` allows the caller to register an
 existing Iceberg table in the metastore, using its existing metadata and data
 files::
 
-    CALL iceberg.system.register_table(schema_name => 'testdb', table_name => 'customer_orders', table_location => 'hdfs://hadoop-master:9000/user/hive/warehouse/customer_orders-581fad8517934af6be1857a903559d44')
+    CALL example.system.register_table(schema_name => 'testdb', table_name => 'customer_orders', table_location => 'hdfs://hadoop-master:9000/user/hive/warehouse/customer_orders-581fad8517934af6be1857a903559d44')
 
 In addition, you can provide a file name to register a table with specific
 metadata.
 This may be used to register the table with some specific table state, or may
 be necessary if the connector cannot automatically figure out the metadata
 version to use::
 
-    CALL iceberg.system.register_table(schema_name => 'testdb', table_name => 'customer_orders', table_location => 'hdfs://hadoop-master:9000/user/hive/warehouse/customer_orders-581fad8517934af6be1857a903559d44', metadata_file_name => '00003-409702ba-4735-4645-8f14-09537cc0b2c8.metadata.json')
+    CALL example.system.register_table(schema_name => 'testdb', table_name => 'customer_orders', table_location => 'hdfs://hadoop-master:9000/user/hive/warehouse/customer_orders-581fad8517934af6be1857a903559d44', metadata_file_name => '00003-409702ba-4735-4645-8f14-09537cc0b2c8.metadata.json')
 
 To prevent unauthorized users from accessing data, this procedure is disabled
 by default. The procedure is enabled only when
 ``iceberg.register-table-procedure.enabled`` is set to ``true``.
@@ -835,7 +835,7 @@ Property name Description
 ================================================== ================================================================
 
 The table definition below specifies format Parquet, partitioning by columns ``c1`` and ``c2``,
-and a file system location of ``/var/my_tables/test_table``::
+and a file system location of ``/var/example_tables/test_table``::
 
     CREATE TABLE test_table (
         c1 integer,
@@ -844,10 +844,10 @@ and a file system location of ``/var/my_tables/test_table``::
         c2 date,
         c3 double)
     WITH (
         format = 'PARQUET',
         partitioning = ARRAY['c1', 'c2'],
-        location = '/var/my_tables/test_table')
+        location = '/var/example_tables/test_table')
 
 The table definition below specifies format ORC, bloom filter index by columns ``c1`` and ``c2``,
-fpp is 0.05, and a file system location of ``/var/my_tables/test_table``::
+fpp is 0.05, and a file system location of ``/var/example_tables/test_table``::
 
     CREATE TABLE test_table (
         c1 integer,
@@ -855,7 +855,7 @@ fpp is 0.05, and a file system location of ``/var/my_tables/test_table``::
         c3 double)
     WITH (
         format = 'ORC',
-        location = '/var/my_tables/test_table',
+        location = '/var/example_tables/test_table',
         orc_bloom_filter_columns = ARRAY['c1', 'c2'],
         orc_bloom_filter_fpp = 0.05)
@@ -876,18 +876,18 @@ can be selected directly, or used in conditional statements.
 
 For example, you can inspect the file path for each record::
 
     SELECT *, "$path", "$file_modified_time"
-    FROM iceberg.web.page_views;
+    FROM example.web.page_views;
 
 Retrieve all records that belong to a specific file using ``"$path"`` filter::
 
     SELECT *
-    FROM iceberg.web.page_views
+    FROM example.web.page_views
     WHERE "$path" = '/usr/iceberg/table/web.page_views/data/file_01.parquet'
 
 Retrieve all records that belong to a specific file using ``"$file_modified_time"`` filter::
 
     SELECT *
-    FROM iceberg.web.page_views
+    FROM example.web.page_views
     WHERE "$file_modified_time" = CAST('2022-07-01 01:02:03.456 UTC' AS timestamp with time zone)
 
 .. _iceberg-metadata-tables:
diff --git a/docs/src/main/sphinx/connector/table-redirection.fragment b/docs/src/main/sphinx/connector/table-redirection.fragment
index ef8928fb3a2f..eb743737aa5c 100644
--- a/docs/src/main/sphinx/connector/table-redirection.fragment
+++ b/docs/src/main/sphinx/connector/table-redirection.fragment
@@ -10,9 +10,9 @@ Therefore, a metastore database can hold a variety of tables with different tabl
 As a concrete example, let's use the following
 simple scenario which makes use of table redirection::
 
-    USE a-catalog.myschema;
+    USE example.example_schema;
 
-    EXPLAIN SELECT * FROM mytable;
+    EXPLAIN SELECT * FROM example_table;
 
 .. code-block:: text
@@ -22,16 +22,16 @@ simple scenario which makes use of table redirection::
 
     ...
     Output[columnNames = [...]]
     │   ...
-    └─ TableScan[table = another-catalog:myschema:mytable]
+    └─ TableScan[table = another_catalog:example_schema:example_table]
         ...
 
 The output of the ``EXPLAIN`` statement points out the actual
-catalog which is handling the ``SELECT`` query over the table ``mytable``.
+catalog which is handling the ``SELECT`` query over the table ``example_table``.
 
 The table redirection functionality works also when using
 fully qualified names for the tables::
 
-    EXPLAIN SELECT * FROM a-catalog.myschema.mytable;
+    EXPLAIN SELECT * FROM example.example_schema.example_table;
 
 .. code-block:: text
@@ -41,7 +41,7 @@ fully qualified names for the tables::
 
     ...
     Output[columnNames = [...]]
     │   ...
-    └─ TableScan[table = another-catalog:myschema:mytable]
+    └─ TableScan[table = another_catalog:example_schema:example_table]
         ...
 
 Trino offers table redirection support for the following operations: