Commit 88cac9b

[SPARK-4642] Documents about running-on-YARN needs update
Added descriptions of these parameters:
- spark.yarn.report.interval
- spark.yarn.queue
- spark.yarn.user.classpath.first
- spark.yarn.scheduler.reporterThread.maxFailures

Modified the description of the default value of this parameter:
- spark.yarn.submit.file.replication
1 parent 5d7fe17 commit 88cac9b

1 file changed (+29 -1 lines changed)

docs/running-on-yarn.md

Lines changed: 29 additions & 1 deletion
@@ -30,7 +30,7 @@ Most of the configs are the same for Spark on YARN as for other deployment modes
 </tr>
 <tr>
 <td><code>spark.yarn.submit.file.replication</code></td>
-<td>3</td>
+<td>The default HDFS replication (usually 3)</td>
 <td>
 HDFS replication level for the files uploaded into HDFS for the application. These include things like the Spark jar, the app jar, and any distributed cache files/archives.
 </td>
@@ -42,13 +42,34 @@ Most of the configs are the same for Spark on YARN as for other deployment modes
 Set to true to preserve the staged files (Spark jar, app jar, distributed cache files) at the end of the job rather than delete them.
 </td>
 </tr>
+<tr>
+<td><code>spark.yarn.user.classpath.first</code></td>
+<td>false</td>
+<td>
+Set to true to make the users app.jar in first order. It is normally last in case conflicts with spark jars.
+</td>
+</tr>
 <tr>
 <td><code>spark.yarn.scheduler.heartbeat.interval-ms</code></td>
 <td>5000</td>
 <td>
 The interval in ms in which the Spark application master heartbeats into the YARN ResourceManager.
 </td>
 </tr>
+<tr>
+<td><code>spark.yarn.scheduler.reporterThread.maxFailures</code></td>
+<td>5</td>
+<td>
+The number of failures in a row until the Spark application master gives up heartbeating into the YARN ResourceManager.
+</td>
+</tr>
+<tr>
+<td><code>spark.yarn.report.interval</code></td>
+<td>1000</td>
+<td>
+The interval in ms in which the YARN client monitors the application status after submits it to the YARN ResourceManager.
+</td>
+</tr>
 <tr>
 <td><code>spark.yarn.max.executor.failures</code></td>
 <td>numExecutors * 2, with minimum of 3</td>
@@ -91,6 +112,13 @@ Most of the configs are the same for Spark on YARN as for other deployment modes
 The amount of off heap memory (in megabytes) to be allocated per driver. This is memory that accounts for things like VM overheads, interned strings, other native overheads, etc. This tends to grow with the container size (typically 6-10%).
 </td>
 </tr>
+<tr>
+<td><code>spark.yarn.queue</code></td>
+<td>default</td>
+<td>
+The YARN queue name which the application is being submitted.
+</td>
+</tr>
 <tr>
 <td><code>spark.yarn.jar</code></td>
 <td>(none)</td>
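
As a rough illustration of how the properties documented in this change could be supplied at submit time, the sketch below builds `--conf key=value` arguments for `spark-submit`. The helper name `conf_args` and the `yarn_props` dict are hypothetical and not part of the commit; the values are simply the defaults listed in the diff, and the master and application jar arguments are left out.

```python
# Hypothetical helper (not part of Spark): turn a dict of Spark properties
# into a flat list of spark-submit "--conf key=value" arguments.
def conf_args(props):
    args = []
    for key, value in sorted(props.items()):
        # Render booleans as lowercase true/false, matching the docs table.
        if isinstance(value, bool):
            value = "true" if value else "false"
        args += ["--conf", f"{key}={value}"]
    return args


# Assumed example values: the defaults shown in the documentation diff.
yarn_props = {
    "spark.yarn.queue": "default",
    "spark.yarn.report.interval": 1000,  # ms between client status checks
    "spark.yarn.scheduler.reporterThread.maxFailures": 5,
    "spark.yarn.user.classpath.first": False,
}

# Print the argument prefix; master and app jar would still need appending.
print(" ".join(["spark-submit"] + conf_args(yarn_props)))
```

Passing the defaults explicitly changes nothing; in practice only the properties being overridden (for example a non-default queue name) would be listed.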
