From 03c80a091cb5959a50d8e598e8088a406468a67e Mon Sep 17 00:00:00 2001
From: Joseph D Rivera
Date: Fri, 28 Aug 2015 14:38:26 -0400
Subject: [PATCH] ZEPPELIN-270 added env variables for pyspark's correct function in yarn.

---
 docs/install/yarn_install.md | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/docs/install/yarn_install.md b/docs/install/yarn_install.md
index 549b770b6e4..d2f170505fc 100644
--- a/docs/install/yarn_install.md
+++ b/docs/install/yarn_install.md
@@ -158,6 +158,8 @@ Set the following properties
 export JAVA_HOME=/home/zeppelin/prerequisites/jdk1.7.0_79
 export HADOOP_CONF_DIR=/etc/hadoop/conf
 export ZEPPELIN_JAVA_OPTS="-Dhdp.version=2.3.1.0-2574"
+export PYTHONPATH="${SPARK_HOME}/python:${SPARK_HOME}/python/lib/py4j-0.8.2.1-src.zip"
+export SPARK_YARN_USER_ENV="PYTHONPATH=${PYTHONPATH}"
 ```
 
 As /etc/hadoop/conf contains various configurations of YARN cluster, Zeppelin can now submit Spark/Hive jobs on YARN cluster form its web interface. The value of hdp.version is set to 2.3.1.0-2574. This can be obtained by running the following command