diff --git a/docs/src/main/sphinx/develop.rst b/docs/src/main/sphinx/develop.rst index b8abe2cf1909..fad497456e27 100644 --- a/docs/src/main/sphinx/develop.rst +++ b/docs/src/main/sphinx/develop.rst @@ -10,6 +10,7 @@ This guide is intended for Trino contributors and plugin developers. develop/spi-overview develop/connectors develop/example-http + develop/insert develop/delete-and-update develop/types develop/functions diff --git a/docs/src/main/sphinx/develop/delete-and-update.rst b/docs/src/main/sphinx/develop/delete-and-update.rst index f94461ed4460..d026ee0dcd68 100644 --- a/docs/src/main/sphinx/develop/delete-and-update.rst +++ b/docs/src/main/sphinx/develop/delete-and-update.rst @@ -16,28 +16,28 @@ methods to get a "rowId" column handle; to start the operation; and to finish th connector's ``ConnectorPageSource``, to read pages on behalf of the Trino engine, and to write deletions and/or updates to the underlying data store. * The connector's ``UpdatablePageSource.getNextPage()`` implementation fetches the next page - from the underlying ``ConnectorPageSource``, optionally reformats the page, and returns it + from the underlying ``ConnectorPageSource``, optionally rebuilds the page, and returns it to the Trino engine. * The Trino engine performs filtering and projection on the page read, producing a page of filtered, projected results. * The Trino engine passes that filtered, projected page of results to the connector's - ``UpdatablePageSource`` ``deleteRows()`` or ``updateRows()`` method. Those methods persist + ``UpdatablePageSource`` ``deleteRows()`` or ``updateRows()`` method. Those methods persist the deletions or updates in the underlying data store.
* When all the pages for a specific split have been processed, the Trino engine calls - ``UpdatablePageSource.finish()``, which returns a ``Collection`` of "fragments" + ``UpdatablePageSource.finish()``, which returns a ``Collection`` of fragments representing connector-specific information about the rows processed by the calls to ``deleteRows`` or ``updateRows``. * When all pages for all splits have been processed, the Trino engine calls ``ConnectorMetadata.finishDelete()`` or - ``finishUpdate``, passing a collection containing all the "fragments" from all the splits. The connector + ``finishUpdate``, passing a collection containing all the fragments from all the splits. The connector does what is required to finalize the operation, for example, committing the transaction. The rowId Column Abstraction ============================ The Trino engine and connectors use a "rowId" column handle abstraction to agree on the identities of rows -to be updated or deleted. The rowId column handle is opaque to the Trino engine. Depending on the connector, -the rowId column handle abstraction could represent several physical columns. For the JDBC connector, the rowId -column handle points might be the primary key for the table. For deletion in Hive ACID tables, the rowId consists +to be updated or deleted. The rowId column handle is opaque to the Trino engine. Depending on the connector, +the rowId column handle abstraction could represent several physical columns. For the JDBC connector, the rowId +column handle might be the primary key for the table. For deletion in Hive ACID tables, the rowId consists of the three ACID columns that uniquely identify rows. The rowId Column for ``DELETE`` @@ -56,8 +56,8 @@ The rowId Column for ``UPDATE`` The Trino engine identifies rows to be updated using a connector-specific rowId column handle, returned by the connector's ``ConnectorMetadata.getUpdateRowIdColumnHandle()`` -method.
In addition to the columns that identify the row, for ``UPDATE`` the rowId column will contain -any columns that the connector requires in order to perform the ``UPDATE`` operation. In Hive ACID, for example, +method. In addition to the columns that identify the row, for ``UPDATE`` the rowId column will contain +any columns that the connector requires in order to perform the ``UPDATE`` operation. In Hive ACID, for example, the rowId column contains the values of all columns *not* updated by the ``UPDATE`` operation, since Hive ACID implements ``UPDATE`` as a ``DELETE`` paired with an INSERT. @@ -65,20 +65,20 @@ UpdatablePageSource API ======================= As mentioned above, to support ``DELETE`` or ``UPDATE``, the connector must define a subclass of -``UpdatablePageSource``, layered over the connector's usual ``ConnectorPageSource``. The interesting methods are: +``UpdatablePageSource``, layered over the connector's usual ``ConnectorPageSource``. The interesting methods are: -* ``Page getNextPage()``. When the Trino engine calls ``getNextPage()``, the ``UpdatablePageSource`` calls - its underlying ``ConnectorPageSource.getNextPage()`` method to get a page. Some connectors will reformat +* ``Page getNextPage()``. When the Trino engine calls ``getNextPage()``, the ``UpdatablePageSource`` calls + its underlying ``ConnectorPageSource.getNextPage()`` method to get a page. Some connectors will rebuild the page before returning it to the Trino engine. -* ``void deleteRows(Block rowIds)``. The Trino engine calls the ``deleteRows()`` method of the same ``UpdatablePageSource`` - instance that supplied the original page, passing a block of "rowIds", created by the Trino engine based on the column +* ``void deleteRows(Block rowIds)``. 
The Trino engine calls the ``deleteRows()`` method of the same ``UpdatablePageSource`` + instance that supplied the original page, passing a block of ``rowIds``, created by the Trino engine based on the column handle returned by ``ConnectorMetadata.getDeleteRowIdColumnHandle()`` -* ``void updateRows(Page page, List<Integer> columnValueAndRowIdChannels)``. The Trino engine calls the ``updateRows()`` +* ``void updateRows(Page page, List<Integer> columnValueAndRowIdChannels)``. The Trino engine calls the ``updateRows()`` method of the same ``UpdatablePageSource`` instance that supplied the original page, passing a page of projected columns, - one for each updated column and the last one for the rowId column. The order of projected columns is defined by the Trino engine, - and that order is reflected in the ``columnValueAndRowIdChannels`` argument. The job of ``updateRows()`` is to: + one for each updated column and the last one for the rowId column. The order of projected columns is defined by the Trino engine, + and that order is reflected in the ``columnValueAndRowIdChannels`` argument. The job of ``updateRows()`` is to: * Extract the updated column blocks and the rowId block from the projected page. * Assemble them in whatever order is required by the connector for storage. @@ -88,9 +88,9 @@ As mentioned above, to support ``DELETE`` or ``UPDATE``, the connector must defi previous contents of the updated rows, and a separate file that inserts completely new rows containing the updated and non-updated column values. -* ``CompletableFuture<Collection<Slice>> finish()``. The Trino engine calls ``finish()`` when all the pages - of a split have been processed. The connector returns a future containing a collection of ``Slice``, representing - connector-specific information about the rows processed. Usually this will include the row count, and might +* ``CompletableFuture<Collection<Slice>> finish()``. The Trino engine calls ``finish()`` when all the pages + of a split have been processed.
The connector returns a future containing a collection of ``Slice``, representing + connector-specific information about the rows processed. Usually this will include the row count, and might include information like the files or partitions created or changed. ``ConnectorMetadata`` ``DELETE`` API @@ -121,11 +121,11 @@ A connector implementing ``DELETE`` must specify three ``ConnectorMetadata`` met passing the ``session`` and ``tableHandle``. ``beginDelete()`` performs any orchestration needed in the connector to start processing the ``DELETE``. - This orchestration varies from connector to connector. In the Hive ACID connector, for example, ``beginDelete()`` + This orchestration varies from connector to connector. In the Hive ACID connector, for example, ``beginDelete()`` checks that the table is transactional and starts a Hive Metastore transaction. ``beginDelete()`` returns a ``ConnectorTableHandle`` with any added information the connector needs when the handle - is passed back to ``finishDelete()`` and the split generation machinery. For most connectors, the returned table + is passed back to ``finishDelete()`` and the split generation machinery. For most connectors, the returned table handle contains a flag identifying the table handle as a table handle for a ``DELETE`` operation. * ``finishDelete()``:: @@ -137,7 +137,7 @@ A connector implementing ``DELETE`` must specify three ``ConnectorMetadata`` met During ``DELETE`` processing, the Trino engine accumulates the ``Slice`` collections returned by ``UpdatablePageSource.finish()``. After all splits have been processed, the engine calls ``finishDelete()``, passing the table handle and that - collection of ``Slice`` fragments. In response, the connector takes appropriate actions to complete the ``Delete`` operation. + collection of ``Slice`` fragments. In response, the connector takes appropriate actions to complete the ``DELETE`` operation.
Those actions might include committing the transaction, assuming the connector supports a transaction paradigm. ``ConnectorMetadata`` ``UPDATE`` API @@ -170,17 +170,17 @@ A connector implementing ``UPDATE`` must specify three ``ConnectorMetadata`` met List<ColumnHandle> updatedColumns) As the last step in creating the ``UPDATE`` execution plan, the connector's ``beginUpdate()`` method is called, - passing arguments that define the ``UPDATE`` to the connector. In addition to the ``session`` + passing arguments that define the ``UPDATE`` to the connector. In addition to the ``session`` and ``tableHandle``, the arguments include the list of updated column handles, in table column order. ``beginUpdate()`` performs any orchestration needed in the connector to start processing the ``UPDATE``. - This orchestration varies from connector to connector. In the Hive ACID connector, for example, ``beginUpdate()`` + This orchestration varies from connector to connector. In the Hive ACID connector, for example, ``beginUpdate()`` starts the Hive Metastore transaction; checks that the updated table is transactional and that neither - parition columns nor bucket columns are updated. + partition columns nor bucket columns are updated. ``beginUpdate`` returns a ``ConnectorTableHandle`` with any added information the connector needs when the handle - is passed back to ``finishUpdate()`` and the split generation machinery. For most connectors, the returned table - handle contains a flag identifying the table handle as a table handle for a ``UPDATE`` operation. For some connectors + is passed back to ``finishUpdate()`` and the split generation machinery. For most connectors, the returned table + handle contains a flag identifying the table handle as a table handle for an ``UPDATE`` operation. For some connectors that support partitioning, the table handle will reflect that partitioning.
* ``finishUpdate``:: @@ -192,5 +192,5 @@ A connector implementing ``UPDATE`` must specify three ``ConnectorMetadata`` met During ``UPDATE`` processing, the Trino engine accumulates the ``Slice`` collections returned by ``UpdatablePageSource.finish()``. After all splits have been processed, the engine calls ``finishUpdate()``, passing the table handle and that - collection of ``Slice`` fragments. In response, the connector takes appropriate actions to complete the ``UPDATE`` operation. + collection of ``Slice`` fragments. In response, the connector takes appropriate actions to complete the ``UPDATE`` operation. Those actions might include committing the transaction, assuming the connector supports a transaction paradigm. diff --git a/docs/src/main/sphinx/develop/insert.rst b/docs/src/main/sphinx/develop/insert.rst new file mode 100644 index 000000000000..87403ae769d7 --- /dev/null +++ b/docs/src/main/sphinx/develop/insert.rst @@ -0,0 +1,30 @@ +============================================= +Supporting ``INSERT`` and ``CREATE TABLE AS`` +============================================= + +To support ``INSERT``, a connector must implement: + +* ``beginInsert()`` and ``finishInsert()`` from the ``ConnectorMetadata`` + interface; +* a ``ConnectorPageSinkProvider`` that receives a table handle and returns + a ``ConnectorPageSink``. + +When executing an ``INSERT`` statement, the engine calls the ``beginInsert()`` +method in the connector, which receives a table handle and a list of columns. +It should return a ``ConnectorInsertTableHandle``, which can carry any +connector-specific information and is passed to the page sink provider. +The ``ConnectorPageSinkProvider`` creates a page sink that accepts ``Page`` objects. + +When all the pages for a specific split have been processed, Trino calls +``ConnectorPageSink.finish()``, which returns a ``Collection`` +of fragments representing connector-specific information about the processed +rows.
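The split-level page sink contract above can be sketched with simplified stand-in types. This is not the real SPI: the actual ``Page``, ``ConnectorPageSink``, and ``Slice`` interfaces live in Trino's SPI packages, and the class name and ``rows=`` fragment format here are illustrative assumptions.

```java
import java.util.List;
import java.util.concurrent.CompletableFuture;

// Simplified stand-in for the SPI Page type; only the row count matters here.
interface Page {
    int getPositionCount(); // number of rows in the page
}

// Toy page sink modeling the appendPage/finish contract described above.
class CountingPageSink {
    private long rowCount;

    // Accept one page at a time; a real sink would also write the page to storage.
    void appendPage(Page page) {
        rowCount += page.getPositionCount();
    }

    // The real SPI method returns CompletableFuture<Collection<Slice>>; a String
    // stands in for each serialized fragment in this sketch.
    CompletableFuture<List<String>> finish() {
        return CompletableFuture.completedFuture(List.of("rows=" + rowCount));
    }
}

public class Main {
    public static void main(String[] args) {
        CountingPageSink sink = new CountingPageSink();
        sink.appendPage(() -> 100); // a page with 100 rows
        sink.appendPage(() -> 42);  // a page with 42 rows
        System.out.println(sink.finish().join()); // prints [rows=142]
    }
}
```

The fragment is deliberately opaque to the engine: only the connector's ``finishInsert()`` (or ``finishDelete()``/``finishUpdate()``) ever interprets it.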
+ +When all pages for all splits have been processed, Trino calls +``ConnectorMetadata.finishInsert()``, passing a collection containing all +the fragments from all the splits. The connector does what is required +to finalize the operation, for example, committing the transaction. + +To support ``CREATE TABLE AS``, the ``ConnectorPageSinkProvider`` must also +return a page sink when receiving a ``ConnectorOutputTableHandle``. This handle +is returned from ``ConnectorMetadata.beginCreateTable()``.
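The begin/finish handshake described above can be sketched end to end. Everything here is a hypothetical simplification: the real types are ``ConnectorInsertTableHandle`` and ``Slice`` fragments from Trino's SPI, and the ``rows=`` fragment format and class names are invented for illustration.

```java
import java.util.List;

// Stand-in for ConnectorInsertTableHandle: connector-specific state that is
// created by beginInsert and carried through to finishInsert.
class InsertTableHandle {
    final String tableName;
    InsertTableHandle(String tableName) { this.tableName = tableName; }
}

// Toy metadata object modeling the coordinator-side INSERT lifecycle.
class ToyMetadata {
    // beginInsert: start the operation; the returned handle is what the
    // page sink provider receives on every worker.
    InsertTableHandle beginInsert(String tableName) {
        return new InsertTableHandle(tableName);
    }

    // finishInsert: receives the fragments from all splits and finalizes the
    // operation, e.g. by committing a transaction; here it just totals row counts.
    String finishInsert(InsertTableHandle handle, List<String> fragments) {
        long total = fragments.stream()
                .mapToLong(f -> Long.parseLong(f.substring("rows=".length())))
                .sum();
        return "committed " + total + " rows to " + handle.tableName;
    }
}

public class Main {
    public static void main(String[] args) {
        ToyMetadata metadata = new ToyMetadata();
        InsertTableHandle handle = metadata.beginInsert("orders");
        // One fragment per split, as produced by each page sink's finish()
        System.out.println(metadata.finishInsert(handle, List.of("rows=100", "rows=42")));
        // prints: committed 142 rows to orders
    }
}
```

The same shape applies to ``CREATE TABLE AS``, with ``beginCreateTable()`` producing the output table handle instead of ``beginInsert()``.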