diff --git a/docs/configuration.md b/docs/configuration.md
index 2687f542b8bd..c77a01e98338 100644
--- a/docs/configuration.md
+++ b/docs/configuration.md
@@ -1004,7 +1004,7 @@ Apart from these, the following properties are also available, and may be useful
- spark.storage.replication.proactive
+ spark.storage.replication.proactive
  false
  Enables proactive block replication for RDD blocks. Cached RDD block replicas lost due to
@@ -1012,6 +1012,40 @@ Apart from these, the following properties are also available, and may be useful
  to get the replication level of the block to the initial number.
+
+ spark.storage.replication.policy
+
+ org.apache.spark.storage.RandomBlockReplicationPolicy
+
+
+ The policy to use for choosing peers when replicating blocks. The default policy randomly
+ chooses the peers to replicate to. A more resilient replication policy is provided by
+ org.apache.spark.storage.BasicBlockReplicationPolicy, which makes use of the
+ topology information of the hosts to choose the peers, much like the HDFS block replication
+ strategy: it tries to choose the first replica within the same rack, and a third replica on
+ a different rack. See spark.storage.replication.topologyMapper below for how to
+ provide the topology information for the hosts.
+
+
+
+ spark.storage.replication.topologyMapper
+
+ org.apache.spark.storage.DefaultTopologyMapper
+
+
+ The topology information of a host is determined by a topology mapping service defined by the
+ abstract class org.apache.spark.storage.TopologyMapper, which can be configured by
+ this property. A default implementation that assumes all hosts are in the same rack is provided
+ by org.apache.spark.storage.DefaultTopologyMapper. A file-based implementation is
+ provided by org.apache.spark.storage.FileBasedTopologyMapper, which reads the
+ topology information from the file org.apache.spark.storage.topologyFile. Each line
+ of this file has the format host1 = /rack1 and maps a host
+ name to its rack information. Note: This configuration only takes effect when
+ spark.storage.replication.policy is set to a policy that takes the topology
+ information into consideration, e.g.
+ org.apache.spark.storage.BasicBlockReplicationPolicy.
+
+
 ### Execution Behavior
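
As a rough illustration of the `host1 = /rack1` file format that the patch describes for `FileBasedTopologyMapper`, the Python sketch below parses such a file into a host-to-rack mapping. It is not Spark's implementation: `parse_topology` and `rack_of` are hypothetical names, and skipping blank lines and `#` comments is an assumption of this sketch rather than documented behavior.

```python
from typing import Dict, Optional


def parse_topology(text: str) -> Dict[str, str]:
    """Parse lines of the form `host = /rack` into a host -> rack mapping."""
    mapping: Dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.strip()
        # Assumption of this sketch: blank lines and '#' comments are ignored.
        if not line or line.startswith("#"):
            continue
        host, sep, rack = line.partition("=")
        if not sep:
            raise ValueError(f"expected 'host = /rack', got: {line!r}")
        mapping[host.strip()] = rack.strip()
    return mapping


def rack_of(mapping: Dict[str, str], host: str) -> Optional[str]:
    """Return the rack for `host`, or None when the host is not listed."""
    return mapping.get(host)


if __name__ == "__main__":
    sample = "host1 = /rack1\nhost2 = /rack2\n"
    topology = parse_topology(sample)
    print(rack_of(topology, "host1"))
```

A topology-aware policy would consult a mapping of this kind to tell same-rack peers from peers on other racks; hosts missing from the file return `None` here, and how Spark itself treats them is not specified in the patch text.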