From 7a554dc6c0da01013c3d9da30a15a0fc0928ea20 Mon Sep 17 00:00:00 2001
From: Haoyuan Li
Date: Tue, 3 Dec 2019 01:02:59 -0800
Subject: [PATCH] Update hive.rst with Alluxio docs

Documentation for this PR: https://github.com/prestodb/presto/pull/13743

Update hive.rst
---
 .../src/main/sphinx/connector/hive.rst | 42 +++++++++++++++++++
 1 file changed, 42 insertions(+)

diff --git a/presto-docs/src/main/sphinx/connector/hive.rst b/presto-docs/src/main/sphinx/connector/hive.rst
index 27f0c22f89b17..1862bdd090cb4 100644
--- a/presto-docs/src/main/sphinx/connector/hive.rst
+++ b/presto-docs/src/main/sphinx/connector/hive.rst
@@ -423,6 +423,48 @@ If your workload experiences the error *Timeout waiting for connection from
 pool*, increase the value of both ``hive.s3select-pushdown.max-connections`` and
 the maximum connections configuration for the file system you are using.
 
+Alluxio Configuration
+---------------------
+
+Presto can read and write tables stored in the Alluxio Data Orchestration System
+`Alluxio `_,
+leveraging Alluxio's distributed block-level read/write caching functionality.
+The tables must be created in the Hive metastore with the ``alluxio://`` location prefix
+(see `Running Apache Hive with Alluxio `_
+for details and examples).
+Presto queries will then transparently retrieve and cache files
+or objects from a variety of disparate storage systems including HDFS and S3.
+
+Alluxio Client-Side Configuration
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+To configure Alluxio client-side properties on Presto, append the Alluxio
+configuration directory (``${ALLUXIO_HOME}/conf``) to the Presto JVM classpath,
+so that the Alluxio properties file ``alluxio-site.properties`` can be loaded as a resource.
+Update the Presto :ref:`presto_jvm_config` file ``etc/jvm.config`` to include the following:
+
+.. code-block:: none
+
+    -Xbootclasspath/a:
+
+The advantage of this approach is that all the Alluxio properties are set in
+the single ``alluxio-site.properties`` file. For details, see `Customize Alluxio User Properties
+`_.
+
+Alternatively, add Alluxio configuration properties to the Hadoop configuration
+files (``core-site.xml``, ``hdfs-site.xml``) and configure the Hive connector
+to use the `Hadoop configuration files <#hdfs-configuration>`__ via the
+``hive.config.resources`` connector property.
+
+Deploy Alluxio with Presto
+^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+To achieve the best performance running Presto on Alluxio, it is recommended
+to collocate Presto workers with Alluxio workers. This allows reads and writes
+to bypass the network. See `Performance Tuning Tips for Presto with Alluxio
+`_
+for more details.
+
 Table Statistics
 ----------------