From 11aafd5a3386956a6769cbce9756f7028691028e Mon Sep 17 00:00:00 2001
From: fe2s
Date: Tue, 30 Oct 2018 15:54:22 +0200
Subject: [PATCH 1/6] #106, document library usage in Java: initial doc

---
 README.md   |  1 +
 doc/java.md | 88 +++++++++++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 89 insertions(+)
 create mode 100644 doc/java.md

diff --git a/README.md b/README.md
index b348d77d..70dceba4 100644
--- a/README.md
+++ b/README.md
@@ -35,6 +35,7 @@ This library is work in progress so the API may change before the official relea
   - [Dataframe](doc/dataframe.md)
   - [Streaming](doc/streaming.md)
   - [Cluster](doc/cluster.md)
+  - [Java](doc/java.md)
   - [Python](doc/python.md)
   - [Configuration](doc/configuration.md)

diff --git a/doc/java.md b/doc/java.md
new file mode 100644
index 00000000..561f29e9
--- /dev/null
+++ b/doc/java.md
@@ -0,0 +1,88 @@
+# Using the library in Java
+
+The library is written in Scala and the API is primarily intended to be used with Scala. But you can also use the library in
+Java because of the Scala/Java interoperability.
+
+
+## RDD
+
+Please, refer to the detailed documentation of [RDD support](#rdd.md) for the full list of available features.
+The RDD functions are available in the following way:
+
+```java
+SparkConf sparkConf = new SparkConf()
+        .setAppName("MyApp")
+        .setMaster("local[*]")
+        .set("spark.redis.host", "localhost")
+        .set("spark.redis.port", "6379");
+
+RedisConfig redisConfig = RedisConfig.fromSparkConf(sparkConf);
+ReadWriteConfig readWriteConfig = ReadWriteConfig.fromSparkConf(sparkConf);
+
+JavaSparkContext jsc = new JavaSparkContext(sparkConf);
+RedisContext redisContext = new RedisContext(jsc.sc());
+
+JavaRDD<Tuple2<String, String>> rdd = jsc.parallelize(Arrays.asList(Tuple2.apply("myKey", "Hello")));
+int ttl = 0;
+
+redisContext.toRedisKV(rdd.rdd(), ttl, redisConfig, readWriteConfig);
+
+```
+
+## Datasets and DataFrames
+
+The Dataset/DataFrame API is identical to Scala. Please, refer to [DataFrame page](#dataframe.md) for details. Here is an
+example with Java:
+
+```Java
+public class Person {
+
+    private String name;
+    private Integer age;
+
+    public Person() {
+    }
+
+    public Person(String name, Integer age) {
+        this.name = name;
+        this.age = age;
+    }
+
+    public String getName() {
+        return name;
+    }
+
+    public void setName(String name) {
+        this.name = name;
+    }
+
+    public Integer getAge() {
+        return age;
+    }
+
+    public void setAge(Integer age) {
+        this.age = age;
+    }
+}
+
+```
+
+```Java
+SparkSession spark = SparkSession
+        .builder()
+        .appName("MyApp")
+        .master("local[*]")
+        .config("spark.redis.host", "localhost")
+        .config("spark.redis.port", "6379")
+        .getOrCreate();
+
+  Dataset<Row> df = spark.createDataFrame(Arrays.asList(new Person("John", 35), new Person("Peter", 40)), Person.class);
+
+  df.write()
+    .format("org.apache.spark.sql.redis")
+    .option("table", "person")
+    .option("key.column", "name")
+    .mode(SaveMode.Overwrite)
+    .save();
+
+```

From 2b85cf1c9822b3fae4f0b4f555b05ae05067dc8b Mon Sep 17 00:00:00 2001
From: fe2s
Date: Tue, 30 Oct 2018 15:56:39 +0200
Subject: [PATCH 2/6] #106 fix markdown links

---
 doc/java.md | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/doc/java.md b/doc/java.md
index 561f29e9..7c990e8a 100644
--- a/doc/java.md
+++ b/doc/java.md
@@ -6,7 +6,7 @@ Java because of the Scala/Java interoperability.
 
 ## RDD
 
-Please, refer to the detailed documentation of [RDD support](#rdd.md) for the full list of available features.
+Please, refer to the detailed documentation of [RDD support](rdd.md) for the full list of available features.
 The RDD functions are available in the following way:
 
 ```java
@@ -31,7 +31,7 @@ redisContext.toRedisKV(rdd.rdd(), ttl, redisConfig, readWriteConfig);
 
 ## Datasets and DataFrames
 
-The Dataset/DataFrame API is identical to Scala. Please, refer to [DataFrame page](#dataframe.md) for details. Here is an
+The Dataset/DataFrame API is identical to Scala. Please, refer to [DataFrame page](dataframe.md) for details. Here is an
 example with Java:
 
 ```Java

From af99321d4fcc6f51649cc7fd36b5cd753aa86dd3 Mon Sep 17 00:00:00 2001
From: fe2s
Date: Tue, 30 Oct 2018 16:14:43 +0200
Subject: [PATCH 3/6] #106 added streaming section

---
 doc/java.md | 41 +++++++++++++++++++++++++++++++++--------
 1 file changed, 33 insertions(+), 8 deletions(-)

diff --git a/doc/java.md b/doc/java.md
index 7c990e8a..8a2e8fd4 100644
--- a/doc/java.md
+++ b/doc/java.md
@@ -76,13 +76,38 @@ SparkSession spark = SparkSession
         .config("spark.redis.port", "6379")
         .getOrCreate();
 
-  Dataset<Row> df = spark.createDataFrame(Arrays.asList(new Person("John", 35), new Person("Peter", 40)), Person.class);
+Dataset<Row> df = spark.createDataFrame(Arrays.asList(new Person("John", 35), new Person("Peter", 40)), Person.class);
+
+df.write()
+    .format("org.apache.spark.sql.redis")
+    .option("table", "person")
+    .option("key.column", "name")
+    .mode(SaveMode.Overwrite)
+    .save();
+```
 
-  df.write()
-    .format("org.apache.spark.sql.redis")
-    .option("table", "person")
-    .option("key.column", "name")
-    .mode(SaveMode.Overwrite)
-    .save();
-```
+## Streaming
+
+The following example demonstrates how to create a stream from Redis list `myList`. Please, refer to [Streaming](streaming.md) for more details.
+
+```java
+ SparkConf sparkConf = new SparkConf()
+    .setAppName("MyApp")
+    .setMaster("local[*]")
+    .set("redis.host", "localhost")
+    .set("redis.port", "6379");
+
+JavaStreamingContext jssc = new JavaStreamingContext(sparkConf, Durations.seconds(1));
+
+RedisConfig redisConfig = new RedisConfig(new RedisEndpoint(sparkConf));
+
+RedisStreamingContext redisStreamingContext = new RedisStreamingContext(jssc.ssc());
+String[] keys = new String[]{"myList"};
+RedisInputDStream<Tuple2<String, String>> redisStream =
+        redisStreamingContext.createRedisStream(keys, StorageLevel.MEMORY_ONLY(), redisConfig);
+
+redisStream.print();
+
+jssc.start();
+jssc.awaitTermination();
+```
\ No newline at end of file

From 4863546388448c0ab7f814b6f56148925968be31 Mon Sep 17 00:00:00 2001
From: fe2s
Date: Tue, 30 Oct 2018 16:17:08 +0200
Subject: [PATCH 4/6] #106 better formatting

---
 doc/java.md | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/doc/java.md b/doc/java.md
index 8a2e8fd4..8e590a8c 100644
--- a/doc/java.md
+++ b/doc/java.md
@@ -76,7 +76,9 @@ SparkSession spark = SparkSession
         .config("spark.redis.port", "6379")
         .getOrCreate();
 
-Dataset<Row> df = spark.createDataFrame(Arrays.asList(new Person("John", 35), new Person("Peter", 40)), Person.class);
+Dataset<Row> df = spark.createDataFrame(Arrays.asList(
+        new Person("John", 35),
+        new Person("Peter", 40)), Person.class);
 
 df.write()
     .format("org.apache.spark.sql.redis")
     .option("table", "person")
     .option("key.column", "name")
     .mode(SaveMode.Overwrite)
     .save();

From 660e4a9d99f74e97247aabf95b567ec3fd6ab3ef Mon Sep 17 00:00:00 2001
From: fe2s
Date: Tue, 30 Oct 2018 16:20:14 +0200
Subject: [PATCH 5/6] #106 better formatting

---
 doc/java.md | 12 ++++++------
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/doc/java.md b/doc/java.md
index 8e590a8c..6e82c3ef 100644
--- a/doc/java.md
+++ b/doc/java.md
@@ -7,7 +7,7 @@ Java because of the Scala/Java interoperability.
 ## RDD
 
 Please, refer to the detailed documentation of [RDD support](rdd.md) for the full list of available features.
-The RDD functions are available in the following way:
+The RDD functions are available in `RedisContext`. Example:
 
 ```java
 SparkConf sparkConf = new SparkConf()
@@ -93,11 +93,11 @@ The following example demonstrates how to create a stream from Redis list `myList`. Please, refer to [Streaming](streaming.md) for more details.
 
 ```java
- SparkConf sparkConf = new SparkConf()
-    .setAppName("MyApp")
-    .setMaster("local[*]")
-    .set("redis.host", "localhost")
-    .set("redis.port", "6379");
+SparkConf sparkConf = new SparkConf()
+        .setAppName("MyApp")
+        .setMaster("local[*]")
+        .set("redis.host", "localhost")
+        .set("redis.port", "6379");
 
 JavaStreamingContext jssc = new JavaStreamingContext(sparkConf, Durations.seconds(1));
 
 RedisConfig redisConfig = new RedisConfig(new RedisEndpoint(sparkConf));

From 3fa62c849f5929eb06d87005f738571932f07b16 Mon Sep 17 00:00:00 2001
From: fe2s
Date: Tue, 30 Oct 2018 16:21:26 +0200
Subject: [PATCH 6/6] #106 wording

---
 doc/java.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/doc/java.md b/doc/java.md
index 6e82c3ef..aab522d2 100644
--- a/doc/java.md
+++ b/doc/java.md
@@ -31,7 +31,7 @@ redisContext.toRedisKV(rdd.rdd(), ttl, redisConfig, readWriteConfig);
 
 ## Datasets and DataFrames
 
-The Dataset/DataFrame API is identical to Scala. Please, refer to [DataFrame page](dataframe.md) for details. Here is an
+The Dataset/DataFrame API is the same in Java and Scala. Please, refer to [DataFrame page](dataframe.md) for details. Here is an
 example with Java:
 
 ```Java
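Beyond `toRedisKV`, the other `RedisContext` writers are called from Java in the same way: the Scala API takes `RedisConfig` and `ReadWriteConfig` as implicits, so a Java caller passes them explicitly as the last arguments. A minimal, non-authoritative sketch reusing the variables from the RDD example in the doc above, and assuming `toRedisHASH` accepts `(rdd, hashName, ttl)` plus those two configs (see [rdd.md](rdd.md) for the actual signatures):

```java
// Hedged sketch: persist the same (field, value) pairs into a single Redis hash "myHash".
// Assumes toRedisHASH mirrors the toRedisKV calling convention shown in the RDD example;
// check rdd.md / RedisContext for the real parameter list.
redisContext.toRedisHASH(rdd.rdd(), "myHash", ttl, redisConfig, readWriteConfig);
```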
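The DataFrame example in the doc covers only the write path. As a hedged sketch of the reverse direction, the `person` table can be loaded back through the standard `DataFrameReader`, reusing the `spark` session and the same `table`/`key.column` options as the write example (see [dataframe.md](dataframe.md) for the options actually supported when reading):

```java
// Sketch: load the previously written "person" table into a Dataset<Row>.
// Assumes the spark.redis.* connection settings from the SparkSession above.
Dataset<Row> loadedDf = spark.read()
        .format("org.apache.spark.sql.redis")
        .option("table", "person")
        .option("key.column", "name")
        .load();

loadedDf.show();
```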