3 changes: 2 additions & 1 deletion core/src/main/scala/org/apache/spark/util/Utils.scala
@@ -403,7 +403,8 @@ private[spark] object Utils extends Logging {
useCache: Boolean) {
val fileName = url.split("/").last
val targetFile = new File(targetDir, fileName)
- if (useCache) {
+ val fetchCacheEnabled = conf.getBoolean("spark.files.useFetchCache", defaultValue = true)
+ if (useCache && fetchCacheEnabled) {
val cachedFileName = s"${url.hashCode}${timestamp}_cache"
val lockFileName = s"${url.hashCode}${timestamp}_lock"
val localDir = new File(getLocalDir(conf))
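
The hunk above gates the existing cache path behind a new boolean setting that defaults to true. A minimal sketch of that gating logic follows, assuming a hypothetical `SimpleConf` stand-in for `SparkConf` so the snippet runs without Spark on the classpath. `SimpleConf` and `FetchCacheGate` are illustrative names and are not part of this change.

```scala
// Hypothetical stand-in for SparkConf: a plain key/value map.
final case class SimpleConf(settings: Map[String, String]) {
  // Mirrors the shape of SparkConf.getBoolean(key, defaultValue):
  // returns the parsed value if the key is set, else the default.
  def getBoolean(key: String, defaultValue: Boolean): Boolean =
    settings.get(key).map(_.toBoolean).getOrElse(defaultValue)
}

object FetchCacheGate {
  // True only when the caller asked for caching AND the
  // spark.files.useFetchCache setting has not been turned off.
  def shouldUseFetchCache(useCache: Boolean, conf: SimpleConf): Boolean = {
    val fetchCacheEnabled =
      conf.getBoolean("spark.files.useFetchCache", defaultValue = true)
    useCache && fetchCacheEnabled
  }
}
```

With an empty configuration the cache is still used whenever `useCache` is true, so existing deployments should see no behavior change. Only an explicit `spark.files.useFetchCache=false` skips the cache and lock files.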
12 changes: 12 additions & 0 deletions docs/configuration.md
@@ -745,6 +745,18 @@ Apart from these, the following properties are also available, and may be useful
the driver, in seconds.
</td>
</tr>
<tr>
<td><code>spark.files.useFetchCache</code></td>
<td>true</td>
<td>
Whether file fetching should use local caching. This improves performance when running multiple
Contributor comment:

I'd reword this slightly in order to match the style of similar boolean options. How about this:

If set to true (default), file fetching will use a local cache that is shared by executors that belong to the same application, which can improve task launching performance when running many executors on the same host. If set to false, these caching optimizations will be disabled and all executors will fetch their own copies of files. This optimization may be disabled in order to use Spark local directories that reside on NFS filesystems (see SPARK-6313 for more details).

executors on the same host and is enabled by default (see
<a href="https://issues.apache.org/jira/browse/SPARK-6313">SPARK-6313</a> for more details).
When set to true (default), caching is enabled. When set to false, caching optimizations are
switched off and no lock files are created, which allows the fetchFiles store to reside on an NFS
mount.
</td>
</tr>
<tr>
<td><code>spark.files.overwrite</code></td>
<td>false</td>