diff --git a/docs/configuration.md b/docs/configuration.md
index 2687f542b8bd..c77a01e98338 100644
--- a/docs/configuration.md
+++ b/docs/configuration.md
@@ -1004,7 +1004,7 @@ Apart from these, the following properties are also available, and may be useful
spark.storage.replication.proactive
spark.storage.replication.proactive
spark.storage.replication.policy
org.apache.spark.storage.BasicBlockReplicationPolicy, which makes use of the
+ topology information of the hosts to choose the peers, much like the HDFS block replication
+ strategy: it tries to choose the first replica within the same rack, and a third replica on
+ a different rack. See spark.storage.replication.topologyMapper below for how to
+ provide the topology information for the hosts.
+ spark.storage.replication.topologyMapper
+ org.apache.spark.storage.TopologyMapper, which can be configured by
+ this property. A default implementation that assumes all hosts are in the same rack is provided
+ by org.apache.spark.storage.DefaultTopologyMapper. A file-based implementation is
+ provided by org.apache.spark.storage.FileBasedTopologyMapper, which reads the
+ topology information from the file given by spark.storage.replication.topologyFile. Each line
+ of this file has the form host1 = /rack1 and provides a mapping from a host
+ name to its rack information. Note: This configuration only takes effect when
+ spark.storage.replication.policy is set to a policy that takes the topology
+ information into consideration, e.g.
+ org.apache.spark.storage.BasicBlockReplicationPolicy.
+
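To make the topology file format described above concrete, here is a minimal sketch of how lines of the form `host1 = /rack1` could be read into a host-to-rack mapping. It is an illustration only, not Spark's implementation: the function names `parse_topology` and `rack_for` are hypothetical, and skipping blank lines and `#` comments is an assumption about the file format.

```python
def parse_topology(text):
    """Parse lines of the form `host = /rack` into a {host: rack} dict."""
    mapping = {}
    for raw in text.splitlines():
        line = raw.strip()
        # Assumption: blank lines and '#' comments are ignored.
        if not line or line.startswith("#"):
            continue
        host, sep, rack = line.partition("=")
        if not sep:
            continue  # no '=' on this line, so it is not a mapping entry
        mapping[host.strip()] = rack.strip()
    return mapping


def rack_for(mapping, host):
    """Return the rack recorded for `host`, or None if the host is unknown."""
    return mapping.get(host)


# Hypothetical topology file contents.
sample = """
host1 = /rack1
host2 = /rack1
host3 = /rack2
"""

topology = parse_topology(sample)
same_rack = rack_for(topology, "host1") == rack_for(topology, "host2")
```

With a mapping like this, a topology-aware policy can tell that `host1` and `host2` share a rack while `host3` does not, which is the information needed to spread replicas across racks.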