[STREAMING] Add redis pub/sub streaming support #2348
|
Can one of the admins verify this patch? |
|
Wow, this seems like a very nice feature add! By the way, is there a matching JIRA issue for this PR? I'm guessing the maintainers will probably ask for one. Heads up @tdas! |
|
QA tests have started for PR 2348 at commit
|
|
QA tests have finished for PR 2348 at commit
|
|
@hayesgm You can see what tests failed in the link at the start of Spark QA's most recent message. You can also run these tests locally by running |
|
Thanks; I'll take a look at those. |
|
@hayesgm This is a wonderful addition, and it's really wonderful to see such community contributions! However, there are some unfortunate practical limitations. Since there are numerous such end systems to which one can push data, having code to interface with too many of them may be unmanageable in the long run. We are still in the middle of figuring out our policy here. Hopefully we can figure out some way to make this contribution usable. I will let you know what comes out of our policy discussions. |
|
@tdas Thanks for looking into this. Let me know how you would like to proceed. |
|
+1 This would be really helpful. Is this still on the roadmap to be included? |
|
Can one of the admins verify this patch? |
|
I'd be happy to make this production-ready, but I have not been privy to the conversations about whether this is something the core team is interested in. @tdas, please advise if you can. |
|
@hayesgm Thank you very much for your patience. After much discussion, we have figured out a way for active community members like you to add features to the Spark ecosystem and maintain them yourselves. This allows the community to develop and share applications on Spark at a faster pace than we can accept features in Spark and its subprojects. Please consider adding this very useful functionality to spark-packages.org so that others can take advantage of your contributions. In the meantime, mind closing this PR? |
|
Clickable link for the lazy: Spark Packages |
This patch adds native streaming support for redis through the Pub/Sub system. When creating the streaming interface, the user specifies channels and patterns to watch. The streaming RDD then receives all messages which match those channels and patterns in the redis database.
I have attempted to match the process used by the other external streaming plugins. I have added test cases to verify correctness of this plugin. I would appreciate any feedback required to have this merged into master.
Note: this patch uses the rediscala library to communicate with Redis, and thus uses the "akka 2.2" branch to match Spark's Akka version and reduce conflicts.
Additionally, I own all source code included in this patch and release all of the source code under the license of Apache Spark (Apache 2.0).
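For readers unfamiliar with how channel and pattern subscriptions differ, here is a rough Python sketch of the matching the description relies on. It is not code from this patch and does not talk to Redis: `matching_subscriptions` is a hypothetical helper, and it assumes that Python's `fnmatchcase` is a close enough stand-in for Redis glob patterns (the two differ in details such as negated character classes and escaping). Exact channel names produce a `message` delivery, while each matching pattern produces a `pmessage` delivery, so a subscriber watching both can see the same payload more than once.

```python
from fnmatch import fnmatchcase


def matching_subscriptions(published_channel, channels, patterns):
    """Return the (kind, subscription) pairs that a publish would trigger.

    Hypothetical helper for illustration only. `channels` are exact names
    (as with SUBSCRIBE); `patterns` are glob-style (as with PSUBSCRIBE),
    approximated here with fnmatchcase.
    """
    # Exact channel subscriptions match by string equality.
    hits = [("message", c) for c in channels if c == published_channel]
    # Pattern subscriptions match by glob; each matching pattern delivers once.
    hits += [("pmessage", p) for p in patterns if fnmatchcase(published_channel, p)]
    return hits


if __name__ == "__main__":
    channels = ["news.sports"]
    patterns = ["news.*", "alerts.?"]
    for published in ["news.sports", "news.tech", "alerts.1", "weather"]:
        print(published, matching_subscriptions(published, channels, patterns))
```

In a streaming receiver, every delivery of either kind would presumably be stored as one record in the stream, which is why overlapping channels and patterns may yield duplicate records.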