
[SPARK-28609][DOC] Fix broken styles/links and make up-to-date #25345

Closed
dongjoon-hyun wants to merge 1 commit into apache:master from dongjoon-hyun:SPARK-28609

Conversation

@dongjoon-hyun (Member) commented Aug 3, 2019

What changes were proposed in this pull request?

This PR aims to fix the broken styles/links and make the doc up-to-date for Apache Spark 2.4.4 and 3.0.0 release.

- `building-spark.md` (screenshot: "Screen Shot 2019-08-02 at 10 33 51 PM")
- `configuration.md` (screenshot: "Screen Shot 2019-08-02 at 10 34 52 PM")
- `sql-pyspark-pandas-with-arrow.md` (screenshot: "Screen Shot 2019-08-02 at 10 36 14 PM")
- `streaming-programming-guide.md` (screenshot: "Screen Shot 2019-08-02 at 10 37 11 PM")
- `structured-streaming-programming-guide.md` (1/2) (screenshot: "Screen Shot 2019-08-02 at 10 38 20 PM")
- `structured-streaming-programming-guide.md` (2/2) (screenshot: "Screen Shot 2019-08-02 at 10 40 05 PM")
- `submitting-applications.md` (screenshot: "Screen Shot 2019-08-02 at 10 41 13 PM")

How was this patch tested?

Manual. Build the doc.

```
SKIP_API=1 jekyll build
```

@dongjoon-hyun changed the title from "Init" to "[SPARK-28609][DOC] Fix broken styles/links and make up-to-date" on Aug 3, 2019
Useful for allowing Spark to resolve artifacts from behind a firewall e.g. via an in-house
artifact server like Artifactory. Details on the settings file format can be
found at http://ant.apache.org/ivy/history/latest-milestone/settings.html
found at <a href="http://ant.apache.org/ivy/history/latest-milestone/settings.html">Settings Files</a>
@dongjoon-hyun (Member, Author) commented Aug 3, 2019

A hyperlink is better than the plain text. Note that "Settings Files" is the title of that page.

Internally, by default, Structured Streaming queries are processed using a *micro-batch processing* engine, which processes data streams as a series of small batch jobs thereby achieving end-to-end latencies as low as 100 milliseconds and exactly-once fault-tolerance guarantees. However, since Spark 2.3, we have introduced a new low-latency processing mode called **Continuous Processing**, which can achieve end-to-end latencies as low as 1 millisecond with at-least-once guarantees. Without changing the Dataset/DataFrame operations in your queries, you will be able to choose the mode based on your application requirements.

In this guide, we are going to walk you through the programming model and the APIs. We are going to explain the concepts mostly using the default micro-batch processing model, and then [later](#continuous-processing-experimental) discuss Continuous Processing model. First, let's start with a simple example of a Structured Streaming query - a streaming word count.
In this guide, we are going to walk you through the programming model and the APIs. We are going to explain the concepts mostly using the default micro-batch processing model, and then [later](#continuous-processing) discuss Continuous Processing model. First, let's start with a simple example of a Structured Streaming query - a streaming word count.
@dongjoon-hyun (Member, Author) commented Aug 3, 2019

This fixes the broken link. Previously, it pointed to a nonexistent anchor (`#continuous-processing-experimental`) instead of that section.
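
To make the linked section concrete, here is a minimal Scala sketch of what "choosing the mode" looks like in practice: the DataFrame operations stay the same and only the trigger changes. The rate source, console sink, local master, and trigger intervals are illustrative assumptions, not part of this PR.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.streaming.Trigger

// Local master only so the sketch is self-contained.
val spark = SparkSession.builder.master("local[2]").appName("TriggerModes").getOrCreate()

// The same streaming DataFrame is used for both modes.
val stream = spark.readStream.format("rate").option("rowsPerSecond", "10").load()

// Default micro-batch mode: the stream is processed as a series of small batch jobs.
val microBatch = stream.writeStream
  .format("console")
  .trigger(Trigger.ProcessingTime("1 second"))
  .start()

// Continuous Processing (Spark 2.3+): lower latency with at-least-once guarantees.
// In practice you would start only one of these two queries.
val continuous = stream.writeStream
  .format("console")
  .trigger(Trigger.Continuous("1 second"))
  .start()
```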

- Joins can be cascaded, that is, you can do `df1.join(df2, ...).join(df3, ...).join(df4, ....)`.

- As of Spark 2.3, you can use joins only when the query is in Append output mode. Other output modes are not yet supported.
- As of Spark 2.4, you can use joins only when the query is in Append output mode. Other output modes are not yet supported.
@dongjoon-hyun (Member, Author) commented:

For now, it's 2.4 because this patch will be backported to branch-2.4. We will update this later when we prepare 3.0.0.
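
To complement the join note above, here is a hedged Scala sketch of a cascaded stream-stream join written in Append output mode; the rate sources and the id column names are hypothetical and only show the shape of the query.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{col, expr}

val spark = SparkSession.builder.master("local[2]").appName("CascadedJoins").getOrCreate()

// Three illustrative streaming DataFrames (column names are made up).
val df1 = spark.readStream.format("rate").load().select(col("value").as("id1"))
val df2 = spark.readStream.format("rate").load().select(col("value").as("id2"))
val df3 = spark.readStream.format("rate").load().select(col("value").as("id3"))

// Joins can be cascaded: df1.join(df2, ...).join(df3, ...)
val joined = df1
  .join(df2, expr("id1 = id2"))
  .join(df3, expr("id2 = id3"))

// Stream-stream joins require Append output mode (as of 2.4).
val query = joined.writeStream
  .format("console")
  .outputMode("append")
  .start()
```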

For that situation you must specify the processing logic in an object.

1. The function takes a row as input.
- First, the function takes a row as input.
@dongjoon-hyun (Member, Author) commented:

The following is the current screenshot of our 2.4.3 page. The enumeration doesn't render well with the template. Also, the Python example has redundant leading spaces, inconsistent with the other Python examples.
(screenshot: "Screen Shot 2019-08-02 at 10 48 02 PM")
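
Since this hunk concerns the foreach sink's "object" form, a hedged Scala sketch of the ForeachWriter pattern it refers to may help; the rate source and the println body are placeholders, not code from the guide.

```scala
import org.apache.spark.sql.{ForeachWriter, Row, SparkSession}

val spark = SparkSession.builder.master("local[2]").appName("ForeachSketch").getOrCreate()
val stream = spark.readStream.format("rate").load()

val query = stream.writeStream
  .foreach(new ForeachWriter[Row] {
    // Called once per partition and epoch; return true to process this partition.
    def open(partitionId: Long, epochId: Long): Boolean = true

    // The processing method takes a row as input, matching the doc text fixed here.
    def process(row: Row): Unit = println(row)

    // Called when processing ends; errorOrNull is null on success.
    def close(errorOrNull: Throwable): Unit = ()
  })
  .start()
```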

.format("rate")
.option("rowsPerSecond", "10")
.option("")

@dongjoon-hyun (Member, Author) commented:

This seems to be an incomplete leftover. We had better remove it because the other language tabs don't have it. After removing it, we will have Kafka examples consistently across all languages.
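
For context, a minimal Scala sketch of the Kafka-style example that the guide keeps across all language tabs; the bootstrap servers and topic name are hypothetical, and it assumes the spark-sql-kafka-0-10 package is on the classpath.

```scala
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder.master("local[2]").appName("KafkaSourceSketch").getOrCreate()

// Subscribe to one topic; the server addresses and topic name are placeholders.
val df = spark
  .readStream
  .format("kafka")
  .option("kafka.bootstrap.servers", "host1:9092,host2:9092")
  .option("subscribe", "topic1")
  .load()

// Kafka keys and values arrive as binary; cast them to strings before use.
val kv = df.selectExpr("CAST(key AS STRING)", "CAST(value AS STRING)")
```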

<tr><td> <code>k8s://HOST:PORT</code> </td><td> Connect to a <a href="running-on-kubernetes.html">Kubernetes</a> cluster in
<code>cluster</code> mode. Client mode is currently unsupported and will be supported in future releases.
The <code>HOST</code> and <code>PORT</code> refer to the [Kubernetes API Server](https://kubernetes.io/docs/reference/generated/kube-apiserver/).
The <code>HOST</code> and <code>PORT</code> refer to the <a href="https://kubernetes.io/docs/reference/generated/kube-apiserver/">Kubernetes API Server</a>.
@dongjoon-hyun (Member, Author) commented:

Inside the table, we need to use an `<a>` tag, like line 183.

@SparkQA commented Aug 3, 2019

Test build #108598 has finished for PR 25345 at commit 32b2265.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@dongjoon-hyun (Member, Author) commented:

Thank you for the review and approval, @HyukjinKwon.
Merged to master/2.4.

dongjoon-hyun added a commit that referenced this pull request Aug 4, 2019
This PR aims to fix the broken styles/links and make the doc up-to-date for Apache Spark 2.4.4 and 3.0.0 release.

- `building-spark.md`
![Screen Shot 2019-08-02 at 10 33 51 PM](https://user-images.githubusercontent.com/9700541/62407962-a248ec80-b575-11e9-8a16-532e9bc421f8.png)

- `configuration.md`
![Screen Shot 2019-08-02 at 10 34 52 PM](https://user-images.githubusercontent.com/9700541/62407969-c7d5f600-b575-11e9-9b1a-a76c6cc095c5.png)

- `sql-pyspark-pandas-with-arrow.md`
![Screen Shot 2019-08-02 at 10 36 14 PM](https://user-images.githubusercontent.com/9700541/62407979-18e5ea00-b576-11e9-99af-7ad9264656ae.png)

- `streaming-programming-guide.md`
![Screen Shot 2019-08-02 at 10 37 11 PM](https://user-images.githubusercontent.com/9700541/62407981-213e2500-b576-11e9-8bc5-a925df7e98a7.png)

- `structured-streaming-programming-guide.md` (1/2)
![Screen Shot 2019-08-02 at 10 38 20 PM](https://user-images.githubusercontent.com/9700541/62408001-49c61f00-b576-11e9-9519-f699775ceecd.png)

- `structured-streaming-programming-guide.md` (2/2)
![Screen Shot 2019-08-02 at 10 40 05 PM](https://user-images.githubusercontent.com/9700541/62408017-7f6b0800-b576-11e9-9341-52664ba6b460.png)

- `submitting-applications.md`
![Screen Shot 2019-08-02 at 10 41 13 PM](https://user-images.githubusercontent.com/9700541/62408027-b2ad9700-b576-11e9-910e-8f22173e1251.png)

Manual. Build the doc.
```
SKIP_API=1 jekyll build
```

Closes #25345 from dongjoon-hyun/SPARK-28609.

Authored-by: Dongjoon Hyun <dhyun@apple.com>
Signed-off-by: Dongjoon Hyun <dhyun@apple.com>
(cherry picked from commit 4856c0e)
Signed-off-by: Dongjoon Hyun <dhyun@apple.com>
dongjoon-hyun deleted the SPARK-28609 branch on August 4, 2019 at 17:25.
rluta pushed a commit to rluta/spark that referenced this pull request on Sep 17, 2019.
kai-chi pushed a commit to kai-chi/spark that referenced this pull request on Sep 26, 2019.