139 changes: 121 additions & 18 deletions docs/src/main/sphinx/develop/connectors.rst
ConnectorFactory
----------------

Instances of your connector are created by a ``ConnectorFactory``
instance which is created when Trino calls ``getConnectorFactory()`` on the
plugin. The connector factory is a simple interface responsible for providing
the connector name and creating an instance of a ``Connector`` object.
A basic connector implementation that only supports reading, but
not writing data, should return instances of the following services:

* :ref:`connector-metadata`
* :ref:`connector-split-manager`
* :ref:`connector-record-set-provider` or :ref:`connector-page-source-provider`
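As a sketch, a minimal factory might look like the following. ``ExampleConnector``
and its constructor are hypothetical, and the exact ``create`` signature may
differ between Trino versions:

.. code-block:: java

    import java.util.Map;

    import io.trino.spi.connector.Connector;
    import io.trino.spi.connector.ConnectorContext;
    import io.trino.spi.connector.ConnectorFactory;

    public class ExampleConnectorFactory
            implements ConnectorFactory
    {
        @Override
        public String getName()
        {
            // the connector name referenced by connector.name
            // in a catalog properties file
            return "example";
        }

        @Override
        public Connector create(String catalogName, Map<String, String> config, ConnectorContext context)
        {
            // wire up and return the Connector; ExampleConnector is a
            // hypothetical implementation returning the services listed above
            return new ExampleConnector(config);
        }
    }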

.. _connector-metadata:

ConnectorMetadata
^^^^^^^^^^^^^^^^^

The connector metadata interface allows Trino to get lists of schemas,
tables, columns, and other metadata about a particular data source.

A basic read-only connector should implement the following methods:

* ``listSchemaNames``
* ``listTables``
* ``streamTableColumns``
* ``getTableHandle``
* ``getTableMetadata``
* ``getColumnHandles``
* ``getColumnMetadata``
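As a sketch, the first two of these methods might look as follows for a data
source with a single fixed schema. ``ExampleTableHandle`` and the backing
``tables`` map are hypothetical:

.. code-block:: java

    @Override
    public List<String> listSchemaNames(ConnectorSession session)
    {
        return List.of("default");
    }

    @Override
    public ConnectorTableHandle getTableHandle(ConnectorSession session, SchemaTableName tableName)
    {
        // returning null tells the engine the table does not exist
        if (!tables.containsKey(tableName)) {
            return null;
        }
        return new ExampleTableHandle(tableName);
    }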

If you are interested in seeing strategies for implementing more methods,
look at the :doc:`example-http` and the Cassandra connector. If your underlying
data source supports schemas, tables and columns, this interface should be
straightforward to implement. If you are attempting to adapt something that
is not a relational database (as the Example HTTP connector does), you may
need to get creative about how you map your data source to Trino's schema,
table, and column concepts.

The connector metadata interface also allows implementing other connector
features, such as:

* Schema management, that is creating, altering and dropping schemas, tables,
  table columns, views, and materialized views.
* Support for table and column comments, and properties.
* Schema, table and view authorization.
* Executing :doc:`table-functions`.
* Providing table statistics used by the cost-based optimizer (CBO), and
  collecting statistics during writes and when analyzing selected tables.
* Data modification, that is:

  * inserting, updating, and deleting rows in tables,
  * refreshing materialized views,
  * truncating whole tables,
  * and creating tables from query results.

* Role and grant management.
* Pushing down:


  * Limit
  * Predicates
  * Projections
  * Sampling
  * Aggregations
  * Joins
  * Top N - limit with sort items
  * Table function invocation

Note that data modification also requires implementing
a :ref:`connector-page-sink-provider`.
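For example, limit pushdown is enabled by overriding ``applyLimit``. The sketch
below assumes a hypothetical ``ExampleTableHandle`` that carries an optional
limit; the exact shape of ``LimitApplicationResult`` varies across Trino
versions:

.. code-block:: java

    @Override
    public Optional<LimitApplicationResult<ConnectorTableHandle>> applyLimit(
            ConnectorSession session, ConnectorTableHandle handle, long limit)
    {
        ExampleTableHandle table = (ExampleTableHandle) handle;
        if (table.getLimit().isPresent() && table.getLimit().getAsLong() <= limit) {
            // nothing new to push down
            return Optional.empty();
        }
        return Optional.of(new LimitApplicationResult<>(
                table.withLimit(limit),
                false,   // limitGuaranteed: the engine still enforces the limit
                false)); // precalculateStatistics
    }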

.. _connector-split-manager:

ConnectorSplitManager
^^^^^^^^^^^^^^^^^^^^^
For data sources that don't have partitioned data, a good strategy
here is to simply return a single split for the entire table. This
is the strategy employed by the Example HTTP connector.
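That single-split strategy can be sketched with the SPI's ``FixedSplitSource``.
``ExampleSplit`` is hypothetical, and the exact ``getSplits`` parameter list
varies across Trino versions:

.. code-block:: java

    @Override
    public ConnectorSplitSource getSplits(
            ConnectorTransactionHandle transaction,
            ConnectorSession session,
            ConnectorTableHandle table,
            SplitSchedulingStrategy splitSchedulingStrategy,
            DynamicFilter dynamicFilter)
    {
        // one split covering the entire table; FixedSplitSource is
        // provided by the SPI for static lists of splits
        return new FixedSplitSource(List.of(new ExampleSplit(table)));
    }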

.. _connector-record-set-provider:

ConnectorRecordSetProvider
^^^^^^^^^^^^^^^^^^^^^^^^^^

Given a split and a list of columns, the record set provider is
responsible for delivering data to the Trino execution engine.
It creates a ``RecordSet``, which in turn creates a ``RecordCursor``
that is used by Trino to read the column values for each row.
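For small or in-memory data, the SPI's ``InMemoryRecordSet`` can serve as the
``RecordSet`` implementation. A sketch, assuming two hard-coded rows with
``VARCHAR`` and ``BIGINT`` columns (the method signature may differ between
Trino versions):

.. code-block:: java

    @Override
    public RecordSet getRecordSet(
            ConnectorTransactionHandle transaction,
            ConnectorSession session,
            ConnectorSplit split,
            ConnectorTableHandle table,
            List<? extends ColumnHandle> columns)
    {
        // rows are materialized up front; InMemoryRecordSet creates the
        // RecordCursor over them when the engine starts reading
        InMemoryRecordSet.Builder builder = InMemoryRecordSet.builder(List.of(VARCHAR, BIGINT));
        builder.addRow("alice", 42L);
        builder.addRow("bob", 7L);
        return builder.build();
    }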

.. _connector-page-source-provider:

ConnectorPageSourceProvider
^^^^^^^^^^^^^^^^^^^^^^^^^^^

Given a split and a list of columns, the page source provider is
responsible for delivering data to the Trino execution engine.
It creates a ``ConnectorPageSource``, which in turn creates ``Page`` objects
that are used by Trino to read the column values.

If not implemented, a default ``RecordPageSourceProvider`` will be used.
Given a record set provider, it returns an instance of ``RecordPageSource``
that builds ``Page`` objects from records in a record set.

A connector should implement a page source provider instead of a record set
provider when it can create pages directly, because converting individual
records from a record set into pages adds overhead during query execution.

To support updating or deleting rows, a connector needs to implement
a ``ConnectorPageSourceProvider`` that returns an ``UpdatablePageSource``.
See :doc:`delete-and-update` for more information.

.. _connector-page-sink-provider:

ConnectorPageSinkProvider
^^^^^^^^^^^^^^^^^^^^^^^^^

Given an insert table handle, the page sink provider is responsible for
consuming data from the Trino execution engine.
It creates a ``ConnectorPageSink``, which in turn accepts ``Page`` objects
that contain the column values.

The following example shows how to iterate over a page to access
single values:

.. code-block:: java

    @Override
    public CompletableFuture<?> appendPage(Page page)
    {
        for (int channel = 0; channel < page.getChannelCount(); channel++) {
            Block block = page.getBlock(channel);
            for (int position = 0; position < page.getPositionCount(); position++) {
                if (block.isNull(position)) {
                    // or handle this differently
                    continue;
                }

                // channel should match the column number in the table
                // use it to determine the expected column type
                String value = VARCHAR.getSlice(block, position).toStringUtf8();
                // TODO do something with the value
            }
        }
        return NOT_BLOCKED;
    }

40 changes: 32 additions & 8 deletions docs/src/main/sphinx/develop/spi-overview.rst
SPI overview
============

When you implement a new Trino plugin, you implement interfaces and
override methods defined by the Service Provider Interface (SPI).

Plugins can provide additional:

* :doc:`connectors`,
* block encodings,
* :doc:`types`,
* :doc:`functions`,
* :doc:`system-access-control`,
* :doc:`group-provider`,
* :doc:`password-authenticator`,
* :doc:`header-authenticator`,
* :doc:`certificate-authenticator`,
* :doc:`event-listener`,
* resource group configuration managers,
* session property configuration managers,
* and exchange managers.

In particular, connectors are the source of all data for queries in
Trino: they back each catalog available to Trino.

For an example ``pom.xml`` file, see the example HTTP connector in the
Deploying a custom plugin
-------------------------

Because Trino plugins use the ``trino-plugin`` packaging type, building
a plugin creates a ZIP file in the ``target`` directory. This file
contains the plugin JAR and all of its dependency JAR files.

To add a custom plugin to a Trino installation, extract the plugin
ZIP file and move the extracted directory into the Trino plugin directory.
For example, for a plugin called ``my-functions`` with version 1.0,
you would extract ``my-functions-1.0.zip`` and then move ``my-functions-1.0``
to ``my-functions`` in the Trino plugin directory.
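These steps can be sketched end to end in a scratch directory. The plugin name
``my-functions``, the version ``1.0``, and all paths here are hypothetical
stand-ins; a real deployment extracts the ZIP produced by the build and moves
it into the actual Trino ``plugin`` directory:

.. code-block:: shell

    work=$(mktemp -d)
    cd "$work"

    # stand-in for the ZIP that the trino-plugin build produces under target/
    mkdir my-functions-1.0
    touch my-functions-1.0/my-functions-1.0.jar
    zip -qr my-functions-1.0.zip my-functions-1.0
    rm -r my-functions-1.0

    # the actual deployment steps: extract the ZIP, then move the extracted
    # directory into the Trino plugin directory under the plugin name
    trino_home="$work/trino"   # stand-in for the Trino installation directory
    mkdir -p "$trino_home/plugin"
    unzip -q my-functions-1.0.zip
    mv my-functions-1.0 "$trino_home/plugin/my-functions"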

.. note::

   Every Trino plugin should be in a separate directory. Do not put JAR files
   directly into the ``plugin`` directory. Plugin directories should contain
   only JAR files, since any subdirectories are not traversed and are ignored.

By default, the plugin directory is the ``plugin`` directory relative to the
directory in which Trino is installed, but it is configurable using the