
Commit f53154a

Addressed Josh's comments.
1 parent ce299e4 commit f53154a

File tree

4 files changed (+9, -9 lines)


docs/_layouts/global.html

Lines changed: 1 addition & 1 deletion
@@ -33,7 +33,7 @@
   <!-- Google analytics script -->
   <script type="text/javascript">
     var _gaq = _gaq || [];
-    _gaq.push(['_setAccount', 'UA-32518208-1']);
+    _gaq.push(['_setAccount', 'UA-32518208-2']);
     _gaq.push(['_trackPageview']);
 
     (function() {

docs/streaming-custom-receivers.md

Lines changed: 3 additions & 3 deletions
@@ -212,7 +212,7 @@ there are two kinds of receivers based on their reliability and fault-tolerance
 
 To implement a *reliable receiver*, you have to use `store(multiple-records)` to store data.
 This flavour of `store` is a blocking call which returns only after all the given records have
-been stored inside Spark. If replication is enabled receiver's configured storage level
+been stored inside Spark. If the receiver's configured storage level uses replication
 (enabled by default), then this call returns after replication has completed.
 Thus it ensures that the data is reliably stored, and the receiver can now acknowledge the
 source appropriately. This ensures that no data is caused when the receiver fails in the middle
@@ -226,7 +226,7 @@ not get the reliability guarantees of `store(multiple-records)`, it has the foll
 - The system takes care of chunking that data into appropriate sized blocks (look for block
   interval in the [Spark Streaming Programming Guide](streaming-programming-guide.html)).
 - The system takes care of controlling the receiving rates if the rate limits have been specified.
-- Because of these two, *unreliable receivers are simpler to implement than reliable receivers.
+- Because of these two, unreliable receivers are simpler to implement than reliable receivers.
 
 The following table summarizes the characteristics of both types of receivers
 
@@ -240,7 +240,7 @@ The following table summarizes the characteristics of both types of receivers
 <td>
   Simple to implement.<br>
   System takes care of block generation and rate control.
-  No fault-tolerance guarantees, can loose data on receiver failure.
+  No fault-tolerance guarantees, can lose data on receiver failure.
 </td>
 </tr>
 <tr>
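For context on the receiver semantics this hunk edits, a sketch of what a *reliable receiver* looks like may help. This is not part of the commit: `ReliableCustomReceiver`, `fetchBatch`, and `acknowledgeSource` are hypothetical names, and compiling it requires the `spark-streaming` artifact on the classpath (so it is an API sketch, not a verified program). The `Receiver` base class, `isStopped`, and the blocking multi-record `store` overload are the real Spark Streaming API.

```scala
import scala.collection.mutable.ArrayBuffer

import org.apache.spark.storage.StorageLevel
import org.apache.spark.streaming.receiver.Receiver

// Hypothetical reliable receiver: acknowledges the source only after the
// blocking store(multiple-records) call has returned.
class ReliableCustomReceiver
  extends Receiver[String](StorageLevel.MEMORY_AND_DISK_2) {

  def onStart(): Unit = {
    // Receive on a separate thread; onStart() must return quickly.
    new Thread("Receiver Thread") {
      override def run(): Unit = receive()
    }.start()
  }

  def onStop(): Unit = { /* signal the receiving thread to stop */ }

  private def receive(): Unit = {
    while (!isStopped) {
      // fetchBatch() stands in for pulling a batch from the source.
      val records: ArrayBuffer[String] = fetchBatch()
      // Blocking call: returns only after all records are stored inside
      // Spark (and replicated, if the storage level uses replication).
      store(records)
      // Safe to ack only after store() has returned.
      acknowledgeSource()
    }
  }

  private def fetchBatch(): ArrayBuffer[String] = ArrayBuffer.empty
  private def acknowledgeSource(): Unit = ()
}
```

An unreliable receiver would instead call the single-record `store(record)` inside the loop and skip the acknowledgement, trading the reliability guarantee for simplicity, as the table in this diff summarizes.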

docs/streaming-flume-integration.md

Lines changed: 1 addition & 1 deletion
@@ -72,7 +72,7 @@ Instead of Flume pushing data directly to Spark Streaming, this approach runs a
 and transactions to pull data from the sink. Transactions succeed only after data is received and
 replicated by Spark Streaming.
 
-This ensures that stronger reliability and
+This ensures stronger reliability and
 [fault-tolerance guarantees](streaming-programming-guide.html#fault-tolerance-semantics)
 than the previous approach. However, this requires configuring Flume to run a custom sink.
 Here are the configuration steps.

docs/streaming-programming-guide.md

Lines changed: 4 additions & 4 deletions
@@ -1568,15 +1568,15 @@ To run a Spark Streaming applications, you need to have the following.
 
 
 - *[Experimental in Spark 1.2] Configuring write ahead logs* - In Spark 1.2,
-  we have introduced a new experimental feature of write ahead logs for achieved strong
+  we have introduced a new experimental feature of write ahead logs for achieving strong
   fault-tolerance guarantees. If enabled, all the data received from a receiver gets written into
   a write ahead log in the configuration checkpoint directory. This prevents data loss on driver
   recovery, thus ensuring zero data loss (discussed in detail in the
   [Fault-tolerance Semantics](#fault-tolerance-semantics) section). This can be enabled by setting
   the [configuration parameter](configuration.html#spark-streaming)
-  `spark.streaming.receiver.writeAheadLogs.enable` to `true`. However, this stronger semantics may
-  come at the cost of the receiving throughput of individual receivers. can be corrected by running
-  [more receivers in parallel](#level-of-parallelism-in-data-receiving)
+  `spark.streaming.receiver.writeAheadLogs.enable` to `true`. However, these stronger semantics may
+  come at the cost of the receiving throughput of individual receivers. This can be corrected by
+  running [more receivers in parallel](#level-of-parallelism-in-data-receiving)
  to increase aggregate throughput. Additionally, it is recommended that the replication of the
  received data within Spark be disabled when the write ahead log is enabled as the log is already
  stored in a replicated storage system. This can be done by setting the storage level for the
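The paragraph edited above can be made concrete with a short configuration sketch (not part of the commit). It uses the configuration key exactly as the doc text at this revision spells it (`spark.streaming.receiver.writeAheadLogs.enable`; note that released Spark versions may spell the key differently), and pairs it with a non-replicated storage level, since the write ahead log already lives in replicated storage. The app name, port, and checkpoint path are placeholders.

```scala
import org.apache.spark.SparkConf
import org.apache.spark.storage.StorageLevel
import org.apache.spark.streaming.{Seconds, StreamingContext}

// Enable the experimental write ahead log (key as named in this doc revision).
val conf = new SparkConf()
  .setAppName("WALExample") // placeholder app name
  .set("spark.streaming.receiver.writeAheadLogs.enable", "true")

val ssc = new StreamingContext(conf, Seconds(1))
// The write ahead log is written under the checkpoint directory,
// so checkpointing to a fault-tolerant filesystem is required.
ssc.checkpoint("hdfs://...") // placeholder path

// Disable in-Spark replication of received data: use a storage level
// without the _2 replication suffix, as the log already provides durability.
val lines = ssc.socketTextStream("localhost", 9999,
  StorageLevel.MEMORY_AND_DISK_SER)
```

Since this is a configuration fragment tied to a Spark deployment, it is a sketch rather than a verified program; the throughput caveat in the diff still applies, and running more receivers in parallel is the suggested mitigation.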
