@@ -8,16 +8,33 @@ title: Spark SQL Programming Guide
{:toc}

# Overview

+
+ <div class="codetabs">
+ <div data-lang="scala" markdown="1">
+
Spark SQL allows relational queries expressed in SQL, HiveQL, or Scala to be executed using
Spark. At the core of this component is a new type of RDD,
[SchemaRDD](api/sql/core/index.html#org.apache.spark.sql.SchemaRDD). SchemaRDDs are composed of
- [Row](api/sql/catalyst/index.html#org.apache.spark.sql.catalyst.expressions.Row) objects, along with
+ [Row](api/sql/core/index.html#org.apache.spark.sql.api.java.Row) objects, along with
a schema that describes the data types of each column in the row. A SchemaRDD is similar to a table
in a traditional relational database. A SchemaRDD can be created from an existing RDD, a Parquet
file, or by running HiveQL against data stored in [Apache Hive](http://hive.apache.org/).

**All of the examples on this page use sample data included in the Spark distribution and can be run in the spark-shell.**

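+ As a quick illustration (a minimal sketch, assuming the `SQLContext` API described in the
+ Getting Started section below; the `Person` case class and the `people.txt` path are only
+ illustrative), a SchemaRDD can be built from an RDD of case classes and then queried in SQL:
+
+ {% highlight scala %}
+ val sqlContext = new org.apache.spark.sql.SQLContext(sc)
+ // The import brings in createSchemaRDD, which implicitly converts an RDD of case classes.
+ import sqlContext._
+
+ case class Person(name: String, age: Int)
+
+ // Build an RDD of Person objects from a text file and register it as a table.
+ val people = sc.textFile("examples/src/main/resources/people.txt")
+   .map(_.split(",")).map(p => Person(p(0), p(1).trim.toInt))
+ people.registerAsTable("people")
+
+ // Run SQL over the registered table; the result is itself a SchemaRDD.
+ val teenagers = sql("SELECT name FROM people WHERE age >= 13 AND age <= 19")
+ teenagers.map(t => "Name: " + t(0)).collect().foreach(println)
+ {% endhighlight %}
+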
+ </div>
+
+ <div data-lang="java" markdown="1">
+ Spark SQL allows relational queries expressed in SQL, HiveQL, or Scala to be executed using
+ Spark. At the core of this component is a new type of RDD,
+ [JavaSchemaRDD](api/sql/core/index.html#org.apache.spark.sql.api.java.JavaSchemaRDD). JavaSchemaRDDs are composed of
+ [Row](api/sql/core/index.html#org.apache.spark.sql.api.java.Row) objects, along with
+ a schema that describes the data types of each column in the row. A JavaSchemaRDD is similar to a table
+ in a traditional relational database. A JavaSchemaRDD can be created from an existing RDD, a Parquet
+ file, or by running HiveQL against data stored in [Apache Hive](http://hive.apache.org/).
+ </div>
+ </div>
+
***************************************************************************************************

# Getting Started
@@ -195,11 +212,6 @@ teenagers.collect().foreach(println)

<div data-lang="java" markdown="1">

- One type of table that is supported by Spark SQL is an RDD of JavaBeans. The BeanInfo
- defines the schema of the table. Currently, Spark SQL does not support JavaBeans that contain
- nested or contain complex types such as Lists or Arrays. You can create a JavaBean by creating a
- class that implements Serializable and has getters and setters for all of its fields.
-
{% highlight java %}

JavaSchemaRDD schemaPeople = ... // The JavaSchemaRDD from the previous example.
@@ -273,11 +285,11 @@ val hiveContext = new org.apache.spark.sql.hive.HiveContext(sc)
// Importing the SQL context gives access to all the public SQL functions and implicit conversions.
import hiveContext._

- sql("CREATE TABLE IF NOT EXISTS src (key INT, value STRING)")
- sql("LOAD DATA LOCAL INPATH 'examples/src/main/resources/kv1.txt' INTO TABLE src")
+ hql("CREATE TABLE IF NOT EXISTS src (key INT, value STRING)")
+ hql("LOAD DATA LOCAL INPATH 'examples/src/main/resources/kv1.txt' INTO TABLE src")

// Queries are expressed in HiveQL
- sql("SELECT key, value FROM src").collect().foreach(println)
+ hql("FROM src SELECT key, value").collect().foreach(println)
{% endhighlight %}
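+
+ The results of hql queries are themselves SchemaRDDs, so they support normal RDD operations.
+ A minimal sketch, reusing the src table loaded above:
+
+ {% highlight scala %}
+ // hql returns a SchemaRDD whose rows can be transformed like any other RDD elements.
+ val results = hql("FROM src SELECT key, value")
+ results.map(row => "Key: " + row(0) + ", Value: " + row(1)).collect().foreach(println)
+ {% endhighlight %}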

</div>