
Commit cc1febf

add docs on the metrics-as-json api
1 parent cbaf287 commit cc1febf


1 file changed (+73, -0 lines)


docs/monitoring.md

Lines changed: 73 additions & 0 deletions
@@ -174,6 +174,79 @@ making it easy to identify slow tasks, data skew, etc.
Note that the history server only displays completed Spark jobs. One way to signal the completion of a Spark job is to stop the Spark Context explicitly (`sc.stop()`), or in Python to use the `with SparkContext() as sc:` construct to handle the Spark Context setup and tear-down, and still show the job history on the UI.

## REST API

In addition to viewing the metrics in the UI, they are also available as JSON. This gives developers
an easy way to create new visualizations and monitoring tools for Spark. The JSON is available for
both running applications, and in the history server. The endpoints are mounted at `/json/v1`. E.g.,
for the history server, they would typically be accessible at `http://<server_url>:18080/json/v1`.

<table class="table">
  <tr><th>Endpoint</th><th>Meaning</th></tr>
  <tr>
    <td>`/applications`</td>
    <td>A list of all applications</td>
  </tr>
  <tr>
    <td>`/applications/<app_id>/jobs`</td>
    <td>A list of all jobs for a given application</td>
  </tr>
  <tr>
    <td>`/applications/<app_id>/jobs/<job_id>`</td>
    <td>Details for one job</td>
  </tr>
  <tr>
    <td>`/applications/<app_id>/stages`</td>
    <td>A list of all stages for a given application</td>
  </tr>
  <tr>
    <td>`/applications/<app_id>/stages/<stage_id>`</td>
    <td>A list of all attempts for a given stage</td>
  </tr>
  <tr>
    <td>`/applications/<app_id>/stages/<stage_id>/<stage_attempt_id>`</td>
    <td>Details for the given stage attempt</td>
  </tr>
  <tr>
    <td>`/applications/<app_id>/stages/<stage_id>/<stage_attempt_id>/taskSummary`</td>
    <td>Summary metrics of all tasks in a stage attempt</td>
  </tr>
  <tr>
    <td>`/applications/<app_id>/stages/<stage_id>/<stage_attempt_id>/taskList`</td>
    <td>A list of all tasks for a given stage attempt</td>
  </tr>
  <tr>
    <td>`/applications/<app_id>/executors`</td>
    <td>A list of all executors for the given application</td>
  </tr>
  <tr>
    <td>`/applications/<app_id>/storage/rdd`</td>
    <td>A list of stored RDDs for the given application</td>
  </tr>
  <tr>
    <td>`/applications/<app_id>/storage/rdd/<rdd_id>`</td>
    <td>Details for the storage status of a given RDD</td>
  </tr>
</table>

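As a rough sketch of how these endpoints might be consumed (not part of the original docs), the following snippet queries a history server on the default port and lists the jobs of the first application it finds. The host, port, and the assumption that each application record exposes an `id` field are illustrative.

```python
import json
from urllib.request import urlopen

# Assumed history server address; adjust host and port for your deployment.
BASE_URL = "http://localhost:18080/json/v1"

def get_json(path):
    """Fetch a /json/v1 endpoint and decode the JSON response."""
    with urlopen(BASE_URL + path) as response:
        return json.load(response)

# List all applications known to the history server.
applications = get_json("/applications")

# Assumes each application record carries an `id` field (illustrative).
app_id = applications[0]["id"]

# List all jobs for that application.
jobs = get_json(f"/applications/{app_id}/jobs")
print(f"Application {app_id} has {len(jobs)} jobs")
```
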
When running on YARN, each application has multiple attempts, so `<app_id>` is really a compound key,
with `<app_id>/<attempt_id>`.

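As an illustration of the compound key (the ids and host below are made up), a jobs URL for a specific attempt of a YARN application could be built like this:

```python
# Hypothetical values, for illustration only.
history_server = "http://localhost:18080"    # assumed history server address
app_id = "application_1426533911241_0001"    # YARN application id
attempt_id = 1                               # attempt within that application

# On YARN the attempt id is appended to the application id in the path.
jobs_url = f"{history_server}/json/v1/applications/{app_id}/{attempt_id}/jobs"
print(jobs_url)
```
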
These endpoints have been strongly versioned to make it easier to develop applications on top of these
endpoints, without knowledge of Spark's internal classes. In particular, Spark guarantees:

* Endpoints will never be removed from one version
* Individual fields will never be removed for any given endpoint
* New endpoints may be added
* New fields may be added to existing endpoints
* New versions of the API may be added in the future, which are free to be completely incompatible with earlier versions
* API versions may be dropped, but only after at least one minor release of coexisting with a new API version

Note that even when examining the UI of a running application, the `applications/<app_id>` portion is
still required, though there is only one application available. E.g., to see the list of jobs for the
running app, you would go to `http://localhost:4040/json/v1/applications/<app_id>/jobs`. This is to
keep the paths consistent in both modes.

# Metrics

Spark has a configurable metrics system based on the
