Conversation

@hayesgm commented Sep 10, 2014

This patch adds native streaming support for Redis through its Pub/Sub system. When creating the streaming interface, the user specifies the channels and patterns to watch. The streaming RDD then receives all messages published to the Redis database that match those channels and patterns.

I have attempted to match the process used by the other external streaming plugins, and I have added test cases to verify the correctness of this plugin. I would appreciate any feedback needed to get this merged into master.

Note: this patch uses the rediscala library to communicate with Redis, and therefore uses that library's "akka 2.2" branch to match Spark's Akka version and avoid conflicts.

Additionally, I own all source code included in this patch and release it under the license of Apache Spark (Apache 2.0).
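For readers unfamiliar with the receiver pattern described above, a minimal usage sketch follows. The object name `RedisUtils`, the method `createStream`, and its parameters are assumptions made for illustration here, not the patch's actual entry point; consult the patch itself for the real API.

```scala
// Hypothetical sketch of creating a Redis Pub/Sub DStream in Spark Streaming.
// RedisUtils.createStream and its signature are assumed for illustration only.
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}

object RedisStreamExample {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("RedisStreamExample")
    // Batch interval of 5 seconds for the streaming context.
    val ssc = new StreamingContext(conf, Seconds(5))

    // Subscribe to exact channels and glob-style patterns,
    // as the PR description explains.
    val channels = Seq("events")
    val patterns = Seq("logs.*")
    val messages = RedisUtils.createStream(ssc, "localhost", 6379, channels, patterns)

    messages.print()
    ssc.start()
    ssc.awaitTermination()
  }
}
```

Under this sketch, every message published to the `events` channel or to any channel matching `logs.*` would arrive as an element of the `messages` DStream.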

@SparkQA commented Sep 10, 2014

Can one of the admins verify this patch?

@nchammas (Contributor)

Wow, this seems like a very nice feature addition!

By the way, is there a matching JIRA issue for this PR? I'm guessing the maintainers will probably ask for one. Heads up @tdas!

@SparkQA commented Sep 11, 2014

QA tests have started for PR 2348 at commit 3cb3b21.

  • This patch merges cleanly.

@SparkQA commented Sep 11, 2014

QA tests have finished for PR 2348 at commit 3cb3b21.

  • This patch fails unit tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
    • class RedisInputDStream(
    • class RedisReceiver(

@nchammas (Contributor)

@hayesgm You can see which tests failed via the link at the start of Spark QA's most recent message. You can also run these tests locally with ./dev/run-tests.

@hayesgm (Author) commented Sep 12, 2014

Thanks; I'll take a look at those.

@tdas (Contributor) commented Sep 23, 2014

@hayesgm This is a wonderful addition, and it's really wonderful to see such community contributions! However, there are some unfortunate practical limitations. Since there are numerous end systems to which one can push data, having code to interface with too many of them may be unmanageable in the long run. We are still in the middle of figuring out our policy here. Hopefully we can find some way to make this contribution usable. I will let you know what comes out of our policy discussions.

@hayesgm (Author) commented Sep 24, 2014

@tdas Thanks for looking into this. Let me know how you would like to proceed.

@jpe42 commented Nov 17, 2014

+1 This would be really helpful.

Is this still on the roadmap to be included?

@AmplabJenkins

Can one of the admins verify this patch?

@hayesgm (Author) commented Dec 11, 2014

I'd be happy to make this production-ready, but I have not been privy to the conversations about whether this is something the core team is interested in. @tdas, please advise if you can.

@tdas (Contributor) commented Dec 25, 2014

@hayesgm Thank you very much for your patience. After much discussion, we have figured out a way for active community members like you to add features to the Spark ecosystem and maintain them yourselves. This allows the community to develop and share applications on Spark at a faster pace than we can accept features into Spark and its subprojects. Please consider adding this very useful functionality to spark-packages.org so that others can take advantage of your contribution.

In the meantime, mind closing this PR?

@nchammas (Contributor)

Clickable link for the lazy: Spark Packages

@asfgit closed this in 534f24b on Dec 27, 2014
