Merged
24 changes: 12 additions & 12 deletions docs/src/main/sphinx/connector/delta-lake.rst
@@ -35,8 +35,8 @@ The connector recognizes Delta tables created in the metastore by the Databricks
runtime. Any non-Delta tables present in the metastore are not visible to the
connector.

To configure the Delta Lake connector, create a catalog properties file, for
example ``etc/catalog/delta.properties``, that references the ``delta-lake``
To configure the Delta Lake connector, create a catalog properties file
``etc/catalog/example.properties`` that references the ``delta-lake``
connector. Update the ``hive.metastore.uri`` with the URI of your Hive metastore
Thrift service:
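The properties file itself is elided by the diff; a minimal sketch, assuming a metastore Thrift service at ``example.net:9083`` (host and port are illustrative):

```properties
# etc/catalog/example.properties — registers the delta-lake connector
connector.name=delta-lake
# URI of your Hive metastore Thrift service (replace with your own)
hive.metastore.uri=thrift://example.net:9083
```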

@@ -501,14 +501,14 @@ You can create a schema with the :doc:`/sql/create-schema` statement and the
subdirectory under the schema location. Data files for tables in this schema
using the default location are cleaned up if the table is dropped::

CREATE SCHEMA delta.my_schema
CREATE SCHEMA example.example_schema
WITH (location = 's3://my-bucket/a/path');

Optionally, the location can be omitted. Tables in this schema must have a
location included when you create them. The data files for these tables are not
removed if the table is dropped::

CREATE SCHEMA delta.my_schema;
CREATE SCHEMA example.example_schema;
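Because this schema has no default location, any table created in it needs an explicit location, as the preceding paragraph notes. A hedged sketch (table name and bucket path are illustrative):

```sql
CREATE TABLE example.example_schema.example_table (id bigint)
WITH (location = 's3://my-bucket/a/path/example_table');
```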

.. _delta-lake-create-table:

@@ -518,7 +518,7 @@ Creating tables
When Delta tables exist in storage, but not in the metastore, Trino can be used
to register them::

CREATE TABLE delta.default.my_table (
CREATE TABLE example.default.example_table (
dummy bigint
)
WITH (
@@ -541,7 +541,7 @@ If the specified location does not already contain a Delta table, the connector
automatically writes the initial transaction log entries and registers the table
in the metastore. As a result, any Databricks engine can write to the table::

CREATE TABLE delta.default.new_table (id bigint, address varchar);
CREATE TABLE example.default.new_table (id bigint, address varchar);

The Delta Lake connector also supports creating tables using the :doc:`CREATE
TABLE AS </sql/create-table-as>` syntax.
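A minimal ``CREATE TABLE AS`` sketch for this connector — the table name and location are illustrative, not part of the original page:

```sql
CREATE TABLE example.default.example_copy
WITH (location = 's3://my-bucket/another/path')
AS SELECT * FROM example.default.new_table;
```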
@@ -563,7 +563,7 @@ There are three table properties available for use in table creation.

The following example uses all three table properties::

CREATE TABLE delta.default.my_partitioned_table
CREATE TABLE example.default.example_partitioned_table
WITH (
location = 's3://my-bucket/a/path',
partitioned_by = ARRAY['regionkey'],
@@ -581,7 +581,7 @@ The connector can register table into the metastore with existing transaction logs
The ``system.register_table`` procedure allows the caller to register an existing Delta Lake
table in the metastore, using its existing transaction logs and data files::

CALL delta.system.register_table(schema_name => 'testdb', table_name => 'customer_orders', table_location => 's3://my-bucket/a/path')
CALL example.system.register_table(schema_name => 'testdb', table_name => 'customer_orders', table_location => 's3://my-bucket/a/path')

To prevent unauthorized users from accessing data, this procedure is disabled by default.
The procedure is enabled only when ``delta.register-table-procedure.enabled`` is set to ``true``.
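The corresponding catalog property, taken directly from the sentence above, would be set as:

```properties
# enables the system.register_table procedure for this catalog
delta.register-table-procedure.enabled=true
```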
@@ -658,7 +658,7 @@ limit the amount of data used to generate the table statistics:

.. code-block:: SQL

ANALYZE my_table WITH(files_modified_after = TIMESTAMP '2021-08-23
ANALYZE example_table WITH(files_modified_after = TIMESTAMP '2021-08-23
16:43:01.321 Z')

As a result, only files newer than the specified time stamp are used in the
@@ -669,7 +669,7 @@ property:

.. code-block:: SQL

ANALYZE my_table WITH(columns = ARRAY['nationkey', 'regionkey'])
ANALYZE example_table WITH(columns = ARRAY['nationkey', 'regionkey'])

To run ``ANALYZE`` with ``columns`` more than once, the next ``ANALYZE`` must
run on the same set or a subset of the original columns used.
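For instance, after the earlier run over ``nationkey`` and ``regionkey``, a later run may narrow to a subset of those columns — a sketch, not from the original page:

```sql
ANALYZE example_table WITH(columns = ARRAY['nationkey'])
```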
@@ -693,7 +693,7 @@ extended statistics for a specified table in a specified schema:

.. code-block::

CALL delta_catalog.system.drop_extended_stats('my_schema', 'my_table')
CALL example.system.drop_extended_stats('example_schema', 'example_table')


Memory usage
@@ -721,7 +721,7 @@ as follows:

.. code-block:: shell

CALL mydeltacatalog.system.vacuum('myschemaname', 'mytablename', '7d');
CALL example.system.vacuum('exampleschemaname', 'exampletablename', '7d');

All parameters are required, and must be presented in the following order:

8 changes: 4 additions & 4 deletions docs/src/main/sphinx/connector/druid.rst
@@ -25,8 +25,8 @@ Create a catalog properties file that specifies the Druid connector by setting
the ``connector.name`` to ``druid`` and configuring the ``connection-url`` with
the JDBC string to connect to Druid.

For example, to access a database as ``druid``, create the file
``etc/catalog/druid.properties``. Replace ``BROKER:8082`` with the correct
For example, to access a database as ``example``, create the file
``etc/catalog/example.properties``. Replace ``BROKER:8082`` with the correct
host and port of your Druid broker.
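The properties block is elided by the diff; it presumably resembles the following sketch, assuming Druid's Avatica JDBC endpoint (the path shown is the common default — verify against your deployment):

```properties
# etc/catalog/example.properties — registers the druid connector
connector.name=druid
# replace BROKER:8082 with the host and port of your Druid broker
connection-url=jdbc:avatica:remote:url=http://BROKER:8082/druid/v2/sql/avatica/
```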

.. code-block:: properties
@@ -43,7 +43,7 @@ secured by basic authentication by updating the URL and adding credentials:
connection-user=root
connection-password=secret

Now you can access your Druid database in Trino with the ``druiddb`` catalog
Now you can access your Druid database in Trino with the ``example`` catalog
name from the properties file.

The ``connection-user`` and ``connection-password`` are typically required and
@@ -144,7 +144,7 @@ to split and then count the number of comma-separated values in a column::
num_reports
FROM
TABLE(
druid.system.query(
example.system.query(
query => 'SELECT
MV_LENGTH(
STRING_TO_MV(direct_reports, ",")
6 changes: 3 additions & 3 deletions docs/src/main/sphinx/connector/elasticsearch.rst
@@ -17,8 +17,8 @@ Configuration
-------------

To configure the Elasticsearch connector, create a catalog properties file
``etc/catalog/elasticsearch.properties`` with the following contents,
replacing the properties as appropriate:
``etc/catalog/example.properties`` with the following contents, replacing the
properties as appropriate for your setup:

.. code-block:: text

@@ -462,7 +462,7 @@ documents in the ``orders`` index where the country name is ``ALGERIA``::
*
FROM
TABLE(
elasticsearch.system.raw_query(
example.system.raw_query(
schema => 'sales',
index => 'orders',
query => '{
10 changes: 5 additions & 5 deletions docs/src/main/sphinx/connector/hudi.rst
@@ -26,10 +26,10 @@ metastore configuration properties as the :doc:`Hive connector
The connector recognizes Hudi tables synced to the metastore by the
`Hudi sync tool <https://hudi.apache.org/docs/syncing_metastore>`_.

To create a catalog that uses the Hudi connector, create a catalog properties file,
for example ``etc/catalog/example.properties``, that references the ``hudi``
connector. Update the ``hive.metastore.uri`` with the URI of your Hive metastore
Thrift service:
To create a catalog that uses the Hudi connector, create a catalog properties
file ``etc/catalog/example.properties`` that references the ``hudi`` connector.
Update the ``hive.metastore.uri`` with the URI of your Hive metastore Thrift
service:
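The properties block is elided by the diff; a minimal sketch, assuming a metastore Thrift service at ``example.net:9083`` (host and port are illustrative):

```properties
# etc/catalog/example.properties — registers the hudi connector
connector.name=hudi
# URI of your Hive metastore Thrift service (replace with your own)
hive.metastore.uri=thrift://example.net:9083
```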

.. code-block:: properties

@@ -119,7 +119,7 @@ Here are some sample queries:

.. code-block:: sql

USE a-catalog.myschema;
USE example.example_schema;

SELECT symbol, max(ts)
FROM stock_ticks_cow
44 changes: 22 additions & 22 deletions docs/src/main/sphinx/connector/iceberg.rst
@@ -288,22 +288,22 @@ subdirectory under the directory corresponding to the schema location.

Create a schema on S3::

CREATE SCHEMA iceberg.my_s3_schema
CREATE SCHEMA example.example_s3_schema
WITH (location = 's3://my-bucket/a/path/');

Create a schema on a S3 compatible object storage such as MinIO::

CREATE SCHEMA iceberg.my_s3a_schema
CREATE SCHEMA example.example_s3a_schema
WITH (location = 's3a://my-bucket/a/path/');

Create a schema on HDFS::

CREATE SCHEMA iceberg.my_hdfs_schema
CREATE SCHEMA example.example_hdfs_schema
WITH (location='hdfs://hadoop-master:9000/user/hive/warehouse/a/path/');

Optionally, on HDFS, the location can be omitted::

CREATE SCHEMA iceberg.my_hdfs_schema;
CREATE SCHEMA example.example_hdfs_schema;

.. _iceberg-create-table:

@@ -314,7 +314,7 @@ The Iceberg connector supports creating tables using the :doc:`CREATE
TABLE </sql/create-table>` syntax. Optionally specify the
:ref:`table properties <iceberg-table-properties>` supported by this connector::

CREATE TABLE my_table (
CREATE TABLE example_table (
c1 integer,
c2 date,
c3 double
@@ -363,7 +363,7 @@ The Iceberg connector supports setting ``NOT NULL`` constraints on the table columns
The ``NOT NULL`` constraint can be set on columns when creating tables with
the :doc:`CREATE TABLE </sql/create-table>` syntax::

CREATE TABLE my_table (
CREATE TABLE example_table (
year INTEGER NOT NULL,
name VARCHAR NOT NULL,
age INTEGER,
@@ -542,7 +542,7 @@ partitioning columns, that can match entire partitions. Given the table definition
from :ref:`Partitioned Tables <iceberg-tables>` section,
the following SQL statement deletes all partitions for which ``country`` is ``US``::

DELETE FROM iceberg.testdb.customer_orders
DELETE FROM example.testdb.customer_orders
WHERE country = 'US'

A partition delete is performed if the ``WHERE`` clause meets these conditions.
@@ -704,7 +704,7 @@ Transform Description
In this example, the table is partitioned by the month of ``order_date``, a hash of
``account_number`` (with 10 buckets), and ``country``::

CREATE TABLE iceberg.testdb.customer_orders (
CREATE TABLE example.testdb.customer_orders (
order_id BIGINT,
order_date DATE,
account_number BIGINT,
Expand All @@ -724,7 +724,7 @@ For example, you could find the snapshot IDs for the ``customer_orders`` table
by running the following query::

SELECT snapshot_id
FROM iceberg.testdb."customer_orders$snapshots"
FROM example.testdb."customer_orders$snapshots"
ORDER BY committed_at DESC

Time travel queries
@@ -739,29 +739,29 @@ snapshot identifier corresponding to the version of the table that
needs to be retrieved::

SELECT *
FROM iceberg.testdb.customer_orders FOR VERSION AS OF 8954597067493422955
FROM example.testdb.customer_orders FOR VERSION AS OF 8954597067493422955

A different approach to retrieving historical data is to specify a point in
time in the past, such as a day or week ago. The latest snapshot of the table
taken before or at the specified timestamp is used internally to provide the
previous state of the table::

SELECT *
FROM iceberg.testdb.customer_orders FOR TIMESTAMP AS OF TIMESTAMP '2022-03-23 09:59:29.803 Europe/Vienna'
FROM example.testdb.customer_orders FOR TIMESTAMP AS OF TIMESTAMP '2022-03-23 09:59:29.803 Europe/Vienna'

Rolling back to a previous snapshot
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Use the ``$snapshots`` metadata table to determine the latest snapshot ID of the table, as in the following query::

SELECT snapshot_id
FROM iceberg.testdb."customer_orders$snapshots"
FROM example.testdb."customer_orders$snapshots"
ORDER BY committed_at DESC LIMIT 1

The procedure ``system.rollback_to_snapshot`` allows the caller to roll back
the state of the table to a previous snapshot id::

CALL iceberg.system.rollback_to_snapshot('testdb', 'customer_orders', 8954597067493422955)
CALL example.system.rollback_to_snapshot('testdb', 'customer_orders', 8954597067493422955)

Schema evolution
----------------
@@ -781,14 +781,14 @@ The procedure ``system.register_table`` allows the caller to register an
existing Iceberg table in the metastore, using its existing metadata and data
files::

CALL iceberg.system.register_table(schema_name => 'testdb', table_name => 'customer_orders', table_location => 'hdfs://hadoop-master:9000/user/hive/warehouse/customer_orders-581fad8517934af6be1857a903559d44')
CALL example.system.register_table(schema_name => 'testdb', table_name => 'customer_orders', table_location => 'hdfs://hadoop-master:9000/user/hive/warehouse/customer_orders-581fad8517934af6be1857a903559d44')

In addition, you can provide a file name to register a table
with specific metadata. This may be used to register the table with
some specific table state, or may be necessary if the connector cannot
automatically determine the metadata version to use::

CALL iceberg.system.register_table(schema_name => 'testdb', table_name => 'customer_orders', table_location => 'hdfs://hadoop-master:9000/user/hive/warehouse/customer_orders-581fad8517934af6be1857a903559d44', metadata_file_name => '00003-409702ba-4735-4645-8f14-09537cc0b2c8.metadata.json')
CALL example.system.register_table(schema_name => 'testdb', table_name => 'customer_orders', table_location => 'hdfs://hadoop-master:9000/user/hive/warehouse/customer_orders-581fad8517934af6be1857a903559d44', metadata_file_name => '00003-409702ba-4735-4645-8f14-09537cc0b2c8.metadata.json')

To prevent unauthorized users from accessing data, this procedure is disabled by default.
The procedure is enabled only when ``iceberg.register-table-procedure.enabled`` is set to ``true``.
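The corresponding catalog property, taken directly from the sentence above, would be set as:

```properties
# enables the system.register_table procedure for this catalog
iceberg.register-table-procedure.enabled=true
```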
@@ -835,7 +835,7 @@ Property name Description
================================================== ================================================================

The table definition below specifies format Parquet, partitioning by columns ``c1`` and ``c2``,
and a file system location of ``/var/my_tables/test_table``::
and a file system location of ``/var/example_tables/test_table``::

CREATE TABLE test_table (
c1 integer,
Expand All @@ -844,18 +844,18 @@ and a file system location of ``/var/my_tables/test_table``::
WITH (
format = 'PARQUET',
partitioning = ARRAY['c1', 'c2'],
location = '/var/my_tables/test_table')
location = '/var/example_tables/test_table')

The table definition below specifies format ORC, bloom filter index by columns ``c1`` and ``c2``,
fpp is 0.05, and a file system location of ``/var/my_tables/test_table``::
fpp is 0.05, and a file system location of ``/var/example_tables/test_table``::

CREATE TABLE test_table (
c1 integer,
c2 date,
c3 double)
WITH (
format = 'ORC',
location = '/var/my_tables/test_table',
location = '/var/example_tables/test_table',
orc_bloom_filter_columns = ARRAY['c1', 'c2'],
orc_bloom_filter_fpp = 0.05)

@@ -876,18 +876,18 @@ can be selected directly, or used in conditional statements. For example, you
can inspect the file path for each record::

SELECT *, "$path", "$file_modified_time"
FROM iceberg.web.page_views;
FROM example.web.page_views;

Retrieve all records that belong to a specific file using ``"$path"`` filter::

SELECT *
FROM iceberg.web.page_views
FROM example.web.page_views
WHERE "$path" = '/usr/iceberg/table/web.page_views/data/file_01.parquet'

Retrieve all records that belong to a specific file using ``"$file_modified_time"`` filter::

SELECT *
FROM iceberg.web.page_views
FROM example.web.page_views
WHERE "$file_modified_time" = CAST('2022-07-01 01:02:03.456 UTC' AS timestamp with time zone)

.. _iceberg-metadata-tables:
12 changes: 6 additions & 6 deletions docs/src/main/sphinx/connector/table-redirection.fragment
@@ -10,9 +10,9 @@ Therefore, a metastore database can hold a variety of tables with different tabl
As a concrete example, let's use the following
simple scenario which makes use of table redirection::

USE a-catalog.myschema;
USE example.example_schema;

EXPLAIN SELECT * FROM mytable;
EXPLAIN SELECT * FROM example_table;

.. code-block:: text

@@ -22,16 +22,16 @@ simple scenario which makes use of table redirection::
...
Output[columnNames = [...]]
│ ...
└─ TableScan[table = another-catalog:myschema:mytable]
└─ TableScan[table = another_catalog:example_schema:example_table]
...

The output of the ``EXPLAIN`` statement points out the actual
catalog which is handling the ``SELECT`` query over the table ``mytable``.
catalog which is handling the ``SELECT`` query over the table ``example_table``.

The table redirection functionality also works when using
fully qualified names for the tables::

EXPLAIN SELECT * FROM a-catalog.myschema.mytable;
EXPLAIN SELECT * FROM example.example_schema.example_table;

.. code-block:: text

@@ -41,7 +41,7 @@ fully qualified names for the tables::
...
Output[columnNames = [...]]
│ ...
└─ TableScan[table = another-catalog:myschema:mytable]
└─ TableScan[table = another_catalog:example_schema:example_table]
...

Trino offers table redirection support for the following operations: