R Interpreter for Zeppelin #208

elbamos · 2015-08-13T07:07:51Z

This is the initial PR for an R Interpreter for Zeppelin. There's still some work to be done (e.g., tests), but it's usable, and it brings to Zeppelin features from R like its library of statistics and machine learning packages, as well as advanced interactive visualizations. So I'd like to open it up for others to comment and/or become involved.

Summary:

- There are two interpreters: one emulates a REPL, the other uses knitr to weave markdown and formatted R output. The two interpreters share a single execution environment.
- Visualisations: Besides R's own graphics, this also supports interactive visualizations with googleVis and rCharts. I am working on htmlwidgets (almost done) with the author of that package, and a next-step project is to get Shiny/ggvis working. Sometimes, a visualization won't load until the page is reloaded. I'm not sure why this is.
- Licensing: To talk to R, this integrates code forked from rScala. rScala was released with a BSD-license option, and the author's permission was obtained.
- Spark: Getting R to share a single spark context with the Spark interpreter group is going to be a project. For right now, the R interpreters live in their own "r" interpreter group, and new spark contexts are created on startup.
- Zeppelin Context: Not yet integrated, in significant part because there's no ZeppelinContext to talk to until it lives in the Spark interpreter group.
- Documentation: A notebook is included that demonstrates what the interpreter does and how to use it.
- Tests: Working on it...

P.S.: This is my first PR on a project of this size; let me know what I messed up and I'll try to fix it ASAP.

martin-g · 2015-08-13T07:45:29Z

r/src/main/scala/org/apache/zeppelin/rinterpreter/KnitRInterpreter.scala

What is the need for double { ?

None - thanks for catching that. At one point I used IntelliJ to reformat the code for Apache styles and it left a bunch of those artifacts. Thought I caught them all.

On Aug 13, 2015, at 3:45 AM, Martin Grigorov [email protected] wrote:

In r/src/main/scala/org/apache/zeppelin/rinterpreter/KnitRInterpreter.scala:

    """opts_knit$set(out.format = 'html',
    |results='asis',
    |progress = FALSE,
    |self.contained = TRUE,
    |verbose = FALSE,
    |comment = NA,
    |tidy = FALSE)
    |
    |.z.show.googleVis <- function(widget) print(widget, tag = 'chart')
    |.z.show.rCharts <- function(widget) widget$show('inline', include_assets=TRUE, cdn=TRUE)
    | """.stripMargin)
    }
    logger.info("KnitR: Finished initial commands")
    }
    def interpret(st: String, context: InterpreterContext): InterpreterResult = try { {

What is the need for double { ?
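For readers following the review comment: the doubled brace after `try` compiles because the inner braces are just a nested block expression, so removing them should not change behaviour. A minimal sketch of that point, using a hypothetical `Result` case class as a stand-in for Zeppelin's `InterpreterResult` (the object and method names below are illustrative, not taken from the PR):

```scala
object TryBraces {
  // Hypothetical stand-in for InterpreterResult; not Zeppelin's actual class.
  final case class Result(success: Boolean, message: String)

  // Doubled braces: the inner { } is a redundant nested block expression.
  def interpretDoubled(st: String): Result = try { {
    Result(success = true, message = st.trim)
  } } catch {
    case e: Exception => Result(success = false, message = e.getMessage)
  }

  // Single braces: same behaviour with one level of nesting fewer.
  def interpretSingle(st: String): Result = try {
    Result(success = true, message = st.trim)
  } catch {
    case e: Exception => Result(success = false, message = e.getMessage)
  }
}
```

Both variants return the same value for the same input, including when the body throws.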

elbamos · 2015-08-25T21:27:54Z

The check is failing because R is missing on Travis. This is ready for people to start playing with.

elbamos · 2015-08-27T12:16:30Z

Ok I'm temporarily giving up on Travis - these test failures seem to all be spurious.

elbamos · 2015-09-06T19:08:15Z

All: This should now be functional. Tests pass (there may still be some spurious errors on Travis, but I'm not seeing them). And it should run out-of-box. People should definitely start looking at it.

jabowles · 2015-09-22T22:47:00Z

On the basic RInterpreter notebook, the first paragraph fails with:

could not find function "repr"
The following paragraph hits "knit" and fails with
org.apache.zeppelin.rinterpreter.rscala.RException
at the .sparkREnv directive.

What assumptions/prerequisites is it looking for in the installation?

elbamos · 2015-09-22T22:50:15Z

If you look in the log, it should show what packages are missing. Repr shouldn't be needed unless something else is going wrong.

Can you send me your complete log?

It sounds like what's happening is that initialization is failing if some packages are missing and that definitely has to be fixed.


elbamos · 2015-09-23T02:48:30Z

@jabowles Pls give what I just pushed a try and see if it resolves your issue. I really need a log to be sure I've gotten to the bottom of it, but this may do it.

sourav-mazumder · 2015-10-03T16:19:21Z

@elbamos found it working great in all scenarios involving SparkR, KnitR and various plots/charts. Great work !!!

Leemoonsoo · 2015-11-15T00:51:02Z

@elbamos

Thanks for the contribution.
I briefly reviewed your code, and here are a few comments.

1. Scala->R invocation

You seem to use rscala and a separate connection to make the Scala -> R invocation.
I think SparkR already has an RBackend that supports R -> Scala invocation and, the same way we do in PySparkInterpreter, this one-way connection can also be used for Scala -> R without introducing much complexity or an additional socket connection.

Using a technique in the R Interpreter similar to what PySparkInterpreter does for py4j will give:

- A much smaller code base, with no need for an additional socket connection besides the one SparkR provides.
- More consistency with the already existing PySparkInterpreter.

2. KnitR Interpreter

It seems that its role is to call knit2html().
In this case, I think it makes more sense to make a function and inject it into RInterpreter (like z.show() in SparkInterpreter), rather than making a separate interpreter.

3. Author, Copyright and Maintainer tags

Zeppelin discourages the Author tag, as well as Copyright / Maintainer tags if they are not really necessary.
Your contribution history will be kept in the commit history and archived in the mailing list.

elbamos · 2015-11-15T01:09:16Z

@Leemoonsoo

Please let me know who from the PPMC I can work with to resolve the continuing problems with the travis build and zeppelin-spark architecture.

Regarding your comments, as I've explained in our e-mail exchange:

1. Scala-R invocation

I am not using rscala.

It is not possible to use the SparkR connection in the way you describe. I did look into this early on. There are numerous packages for interfacing the jvm and R. None of them use two-way connections.

The python-spark and zeppelin integrations you describe leverage an external dependency. There is no comparable package available for R that has a compatible license.

2. Knitr

As I understand, you don't use R, so it may seem strange to have a separate interpreter rather than a function. That's understandable.

The distinction between the r-repl and knitr interpreters makes perfect sense for people who are coming from R. The repl and knitr handle code, and errors, and output, in fundamentally different ways.

They have different capabilities. It is not possible, consistent with the zeppelin architecture, to put both capabilities into a single interpreter without making the use of that interpreter very unintuitive for someone coming from an R background.

The knit2html() command is something no R user would ever use when making use of R. It is perhaps best thought of as part of the "R operating system."

3. The author tag

That's really fine, but in my view this is the lowest-priority possible item.

The highest priority is the travis build problems. Travis consistently fails building parts other than rzeppelin.

The other high priority is consistency with the Zeppelin-Spark interface, which has grown quite complex and difficult to use.

My users have had a long stream of issues getting Spark to work through zeppelin. They get reported to me as rzeppelin issues, but have all turned out to be issues in the way zeppelin and spark interface, e.g., with conflicts between SPARK_HOME and spark.home. rzeppelin needs to be consistent with the rest of the Zeppelin architecture in that regard. This is not something I can fix because I don't own that code.

Leemoonsoo · 2015-11-15T01:56:09Z

> I am not using rscala.
>
> It is not possible to use the SparkR connection in the way you describe. I did look into this early on. There are numerous packages for interfacing the jvm and R. None of them use two-way connections.
>
> The python-spark and zeppelin integrations you describe leverage an external dependency. There is no comparable package available for R that has a compatible license.

Correction: rscala -> forked-rscala.

You don't need to find any other external package for a two-way connection. You can make two-way R<->JVM invocations without any external dependency, with the same technique used in PySparkInterpreter.

PySpark implements a one-way connection, Python->JVM, similar to SparkR.
PySparkInterpreter leverages this one-way connection and successfully makes JVM->Python invocations without any external dependency.

I guess RInterpreter can do the same with SparkR. It will be able to make JVM->R invocations without any dependency and without an additional socket connection, using the same technique as PySparkInterpreter.
That will simplify the code base, by more than 1000 lines of code.
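One way to read the technique described above: the guest process (Python or R) keeps its existing guest->JVM connection and uses it to poll a JVM-side object for work, then reports results back over the same connection, so the JVM never opens a connection of its own. The sketch below only illustrates that polling idea. `Bridge`, `getStatements`, `setStatementsFinished`, and the fake guest thread are illustrative names assumed for this example, not Zeppelin's or SparkR's actual API, and whether SparkR's RBackend can be driven this way is exactly what is disputed in this thread.

```scala
import java.util.concurrent.LinkedBlockingQueue

object OneWayBridge {
  final case class Request(code: String)
  final case class Response(output: String, error: Boolean)

  // JVM-side object the guest reaches over its existing guest -> JVM connection.
  class Bridge {
    private val requests = new LinkedBlockingQueue[Request]()
    private val responses = new LinkedBlockingQueue[Response]()

    // Called on the JVM side: hand code to the guest and wait for its result.
    def interpret(code: String): Response = {
      requests.put(Request(code))
      responses.take()
    }

    // Called by the guest: block until the JVM has something to run.
    def getStatements(): Request = requests.take()

    // Called by the guest once it has evaluated the request.
    def setStatementsFinished(output: String, error: Boolean): Unit =
      responses.put(Response(output, error))
  }

  // Stand-in for the R (or Python) process: a daemon thread that polls the bridge.
  def startFakeGuest(bridge: Bridge): Thread = {
    val t = new Thread(new Runnable {
      def run(): Unit = while (true) {
        val req = bridge.getStatements()
        bridge.setStatementsFinished("evaluated: " + req.code, error = false)
      }
    })
    t.setDaemon(true)
    t.start()
    t
  }
}
```

In this toy version the "guest" is a thread in the same JVM; in a real interpreter the two guest-facing methods would be invoked remotely by the R or Python process.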

> As I understand, you don't use R, so it may seem strange to have a separate interpreter rather than a function. That's understandable.
>
> The distinction between the r-repl and knitr interpreters makes perfect sense for people who are coming from R. The repl and knitr handle code, and errors, and output, in fundamentally different ways.
>
> They have different capabilities. It is not possible, consistent with the zeppelin architecture, to put both capabilities into a single interpreter without making the use of that interpreter very unintuitive for someone coming from an R background.
>
> The knit2html() command is something no R user would ever use when making use of R. It is perhaps best thought of as part of the "R operating system."

Yes, I'm not familiar with R, so please convince me. What the KnitR Interpreter does is basically

     rContext.set(".zeppknitrinput", st.split("\n"))
     rContext.eval(".knitout <- knit2html(text=.zeppknitrinput, envir = rzeppelin:::.zeppenv)")
     rContext.getS0(".knitout")

and the basic usage I found on the KnitR website is

     library(knitr)
     ?knit
     knit(input)

So, to me, it's hard to see why a function like z.knite(input) does not make sense.
If you have use cases, please share.

By the way, KnitR is GPL licensed. I don't think Zeppelin can have a feature that depends on GPL-licensed code.

> That's really fine, but in my view this is the lowest-priority possible item.

License and copyright problems are among the highest-priority items in the Zeppelin project.

> The highest priority is the travis build problems. Travis consistently fails building parts other than rzeppelin.

The latest exception from your CI build is

    Caused by: java.lang.OutOfMemoryError: PermGen space
        at org.apache.zeppelin.rinterpreter.RSparkTest$$anonfun$3.apply$mcV$sp(RSparkTest.scala:51)
    An exception or error caused a run to abort. This may have been caused by a problematic custom reporter.
    Exception in thread "ScalaTest-main"
    Exception: java.lang.OutOfMemoryError thrown from the UncaughtExceptionHandler in thread "ScalaTest-main"

Please try increasing the PermGen memory option in the test.

> My users have had a long stream of issues getting Spark to work through zeppelin. They get reported to me as rzeppelin issues, but have all turned out to be issues in the way zeppelin and spark interface, e.g., with conflicts between SPARK_HOME and spark.home. rzeppelin needs to be consistent with the rest of the Zeppelin architecture in that regard. This is not something I can fix because I don't own that code.

https://issues.apache.org/jira/browse/ZEPPELIN-421 will address removal of the spark.home property from the Zeppelin settings window. Until then, you can simply avoid setting spark.home.

And technically you don't own the Zeppelin code, the ASF does, but nothing stops you from fixing the problem.

Leemoonsoo · 2015-11-15T02:34:23Z

Another point is the location of this package.

If it is implementing a general R Interpreter, I think having a separate submodule, as it is now, makes sense.
But if it lives in the same 'spark' interpreter group and implements a SparkR interpreter that shares resources (e.g. SparkContext) with SparkInterpreter, I think it makes more sense to move it into the 'spark' submodule instead of having a separate 'R' submodule.

Also, this approach is more consistent with the package location of PySparkInterpreter, which is also part of the Spark interpreter group.

elbamos · 2015-11-15T22:37:33Z

@Leemoonsoo

I will be rebasing over the next few days.

In the meantime I'm going to go through the issues you raised in sequence...

1. The R-Scala Interface

I've explained several times that the proposal to use SparkR bi-directionally doesn't work. I don't feel that I have more to add about that.

I will try to reduce the size of the code that originated in rscala.

If this is not going to be acceptable to the PPMC, please tell me now.

2. KnitR

What you're proposing is that users enter the same boilerplate, which they would have to figure out for themselves, every time they want to use knitr.

Knitr and the repl are fundamentally two different ways for users to interact with R. They have very different behaviors in terms of error reporting and handling visualizations.

If you don't want to trust me about this, then I suggest we ask some other R users what makes the most sense.

If this is not going to be acceptable to the PPMC, please tell me now.

3. KnitR GPL License

> By the way, KnitR is GPL license. I don't think Zeppelin can have a feature that depends on GPL licensed code.

KnitR is an optional external dependency. This is not a licensing problem.

It is also not a licensing problem to interact with GPL code that isn't supplied with Zeppelin.

For example, R itself is GPL code. So if Zeppelin cannot interact with external GPL code, then there cannot be an R interpreter in Zeppelin at all.

Considering that Spark interacts with R, I think this issue is closed.

4. License and copyright

> License and Copyright problems are one of the highest priority item in Zeppelin project

Huh? What we were talking about is who gets identified in code as the author. That is obviously not a license/copyright issue, it's an issue of credit.

You said that it is discouraged to identify anybody as the author in Apache projects.

However, the current code does identify authors, with you identified as the principal author.

So, at this point I'm not sure what you're referring to?

5. Location of the Package

It's fine with me if it goes under /spark. The reason it's in /r is that it simplifies testing and development. Someone will have to merge the two build scripts; I'm using scalatest for testing.

6. Travis Builds

Actually the error in the travis logs begins with this:

15/09/23 03:12:22 INFO HiveMetaStore: No user is added in admin role, since config is empty
15/09/23 03:12:24 WARN SparkInterpreter: Can't create HiveContext. Fallback to SQLContext
java.lang.reflect.InvocationTargetException
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:526)

That's not coming from rzeppelin. That's coming from the SparkInterpreter when rzeppelin asks it to initiate a spark backend.

This is what I mean about issues in the spark-zeppelin interface.

elbamos · 2015-11-15T23:25:43Z

@Leemoonsoo @sourav-mazumder @Emaasit @felixcheung

Actually, I just watched the youtube video in which, in September, Lee and Felix demo'd this code at the Seattle R users group.

Lee, I will be e-mailing you my phone number. Call me.

jongyoul · 2015-11-16T03:07:34Z

@elbamos Hi, you're misunderstanding how authorship of code is handled in Apache projects, including Zeppelin. What @Leemoonsoo said is that we track your contribution through the git commit logs, not through tags in the code. You should remove your developer tag in r/pom.xml and delete files such as r/R/rzeppelin/LICENSE. If you want to provide instructions like those in r/R/rzeppelin/DESCRIPTION and r/R/rzeppelin/README, please follow our documentation rules. You should also delete r/R/rzeppelin/NEWS for the same reason. It is also good practice not to re-indent code you are not editing, even if its indentation is wrong. And you should split the support for Spark 1.4.2 and the configurable Scala version into separate issues; it is likewise good practice for one issue to resolve only one problem. Even if your code works perfectly, your contribution ignores these general conventions. I'm not an expert in R, so I can't judge how good your code is, but you should consider several things before you contribute to Zeppelin or another Apache project.

@Leemoonsoo I think we need to improve our documentation. The title of this PR doesn't reference any JIRA issue number.

elbamos · 2015-11-16T03:32:25Z

@jongyoul

I have no idea what you were trying to say.

It appears to me that your primary concern is the names identified in the pom files as the authors? Is that correct?

(Many of the files you are referencing, under rzeppelin/, are part of the source that has to be there for an R package.)

It appears to me that I have contributed in the correct manner, but there may be some people affiliated with this project who have other issues.

jongyoul · 2015-11-16T03:55:51Z

@elbamos Could you please tell me what r/src/main/scala/scala/Console.scala is used for? I can't find any code that calls Console. And if you want to exclude some files from the apache-rat check, you should tell us the reason. This one is a source file and will be included in the binary.

jongyoul · 2015-11-16T03:58:58Z

@elbamos Do you mean that it's ok because the files are located under rzeppelin/, and do you mean rzeppelin is not a part of the contribution?

elbamos · 2015-11-16T03:59:51Z

@jongyoul What's excluded from rat are files that get generated automatically during the build process -- they are the R equivalent of a .class or .jar file.

There are two rzeppelin folders. One contains R source code. The other contains the actual compiled R package.

Console.scala is used to create a virtual interface to R. I believe it's called via an implicit, but that is actually a part from rscala that I haven't tinkered with.

bzz · 2016-04-02T08:45:30Z

I believe that a rebase onto the latest master should eliminate most of the CI issues.

Did one more round of reviews and updated #208 (comment): a minor cleanup of licenses is needed.

elbamos · 2016-04-02T22:13:40Z

@bzz We are green! Good to go?

bzz · 2016-04-03T01:21:52Z

pom.xml

              <exclude>docs/Gemfile.lock</exclude>
+
+	      <!-- compiled R packages (binaries) -->
+	      <exclude>R/**</exclude>


Doesn't this exclude the whole ./r/ Maven sub-module (assuming a case-insensitive FS)?
AFAIK compiled R packages now live under ~/R, and fine-grained excludes should be (and already are) managed by ./r/pom.xml
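To make the concern concrete, here is a minimal sketch of why an `R/**` exclude can also swallow `r/` when names are compared without regard to case. It is a deliberately simplified matcher that only handles patterns of the form `<dir>/**`; it is not how apache-rat or Maven actually resolve excludes, and `excludedBy` is a hypothetical helper written for this illustration.

```scala
object ExcludeCheck {
  // Does `path` fall under the top-level directory named by an exclude such as "R/**"?
  // Simplified on purpose: only patterns of the form "<dir>/**" are supported.
  def excludedBy(pattern: String, path: String, caseSensitive: Boolean): Boolean = {
    val dir = pattern.stripSuffix("/**")
    val top = path.split('/').head
    if (caseSensitive) top == dir else top.equalsIgnoreCase(dir)
  }
}
```

With case-sensitive comparison `r/pom.xml` is not matched by `R/**`; with case-insensitive comparison it is, which is the situation bzz describes.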

bzz · 2016-04-03T01:27:27Z

Looks awesome!
I think there are just 2 minor license issues left in #208 (comment) now.

Copied them here for convenience:

- remove <exclude>R/**</exclude> from the root ./pom.xml, as it excludes the whole Maven R sub-module
- as per advice from the IPMC, please remove the full text of the non-Apache license that you added before (r/R/rzeppelin/R/common.R, r/R/rzeppelin/R/globals.R, etc)

and we should be good to go!

elbamos · 2016-04-03T05:40:11Z

@bzz The push I just made should reflect all of the latest requests. Thank you for pointing out the effect of a case-insensitive FS. There are no substantive changes in this push, so if it's going red, it's just the random-CI stuff.


elbamos · 2016-04-03T20:38:33Z

@bzz Are we good?

### What is this PR for?
Improvement of Virtual Machine Script to support R interpretor

### What type of PR is it?
Improvement of Virtual Machine Script to support R interpretor

### Todos
* [x] - Test with #208
* [x] - Test with #702

### Is there a relevant Jira issue?
Zeppelin-700

### How should this be tested?
Follow the steps in this Read Me to build a VM from scratch: https://github.com/apache/incubator-zeppelin/blob/master/scripts/vagrant/zeppelin-dev/README.md

### Screenshots (if appropriate)

### Questions:
* Does the licenses files need update? Added to all headers
* Is there breaking changes for older versions? No
* Does this needs documentation? Yes, this will be a separate PR to update docs and README

Author: Jeff Steinmetz <[email protected]>

Closes #751 from jeffsteinmetz/ZEPPELIN-700 and squashes the following commits:

e03dba5 [Jeff Steinmetz] update to support R interpreter sparkr build profile
a9a2052 [Jeff Steinmetz] add base64enc, repr and htmltools
bdcdf5f [Jeff Steinmetz] removed packages not required for pr702. repr not needed - base64encode not needed - htmltools not needed
d497994 [Jeff Steinmetz] fix so that devtools will install
5643fc6 [Jeff Steinmetz] plotting in r interpreter requires the repr package. install devtools package first so repr can be installed.
fb18a2b [Jeff Steinmetz] plotting in r interpreter requires the repr package. install devtools package first so repr can be installed.
ef8f638 [Jeff Steinmetz] ZEPPELIN-700. Add Ansible R role to Virtual Machine to support dependencies for the R Interpreter
b940bcd [Jeff Steinmetz] ZEPPELIN-700. Add Ansible R role to Virtual Machine to support dependencies for the R Interpreter

bzz · 2016-04-04T01:12:26Z

@elbamos awesome, thank you! Please, let me do a final pass and I'll post back ASAP.

Just FYI - I'm still working on fixing random CI failures under ZEPPELIN-783 (not directly related to or blocking this work)

elbamos · 2016-04-04T01:58:42Z

@bzz thanks!

Yeah, I was going to suggest that getting CI to work reliably is probably something important to the project beyond the scope of this PR.

One thing that might offer a quick improvement, is @jeffsteinmetz's suggestion that we stop supporting older versions of Spark (e.g., 1.1, 1.2). That should reduce the random test failures by at least 20%.

bzz · 2016-04-04T02:11:51Z

@elbamos everything looks good, except one last thing - the BSD license text needs to be removed from the r/src/main/scala/scala/Console.scala header as well, as it was not there originally and we have it in a separate file, clearly pointed to from the root LICENSE.

Sorry for not pointing it out earlier, but reviewing the whole 4000+ LOC diff at once every time is not easy.

I think we should be good with merging it right after that. And yes, improving project CI is a priority and will be continued.

elbamos · 2016-04-04T04:31:00Z

@bzz So we should be good to merge now right?

bzz · 2016-04-04T13:38:49Z

Looks good to me. thank you for the efforts.

I think it is ready to be merged, and if there is no more discussion - I will merge it tomorrow.

elbamos · 2016-04-04T17:31:14Z

@bzz the push I just made fixed an issue where, if -Pr wasn't specified, rat would complain about the licenses of files under the r/ subdirectory. There are no other changes.

Leemoonsoo · 2016-04-05T07:32:52Z

+1

### What is this PR for?
Recently #208 and #702 are merged into master branch. They passed the CI test individually, but failing after both merged.

* zeppelin-web build error
```
[ERROR] npm ERR! registry error parsing json
[ERROR] npm http 200 https://registry.npmjs.org/bower/1.7.2
[ERROR] npm ERR! SyntaxError: Unexpected token �
```
* 'SparkRInterpreter.java' and 'RRepl.java' uses the same interpreter name. 'spark.r'. That conflicts and make https://github.com/apache/incubator-zeppelin/blob/master/zeppelin-server/src/test/java/org/apache/zeppelin/rest/ZeppelinSparkClusterTest.java#L87 fails.
* R.md and r.md both exists under same directory. That confuses git client.

### What type of PR is it?
Hot Fix

### Todos
* [x] - Merge R.md and r.md
* [x] - Fix zeppelin-web build error
* [x] - Change interpreter listing order

### What is the Jira issue?

### How should this be tested?

### Screenshots (if appropriate)

### Questions:
* Does the licenses files need update? no
* Is there breaking changes for older versions? no
* Does this needs documentation? no

Author: Lee moon soo <[email protected]>

This patch had conflicts when merged, resolved by
Committer: Lee moon soo <[email protected]>

Closes #815 from Leemoonsoo/r_hotfix and squashes the following commits:

eeb411e [Lee moon soo] Change interpreter listing order
9baf57b [Lee moon soo] Change node and npm version
6854ac7 [Lee moon soo] R.md -> r.md
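
The interpreter-name clash described in this hotfix is the kind of problem a simple duplicate check can surface. The sketch below is hypothetical and is not Zeppelin's registration code: it takes (class name, interpreter name) pairs and returns every name claimed by more than one class. The third entry in the usage example (`OtherInterpreter`, `spark.other`) is invented for illustration.

```scala
object NameConflicts {
  // Input: (className, registeredInterpreterName) pairs.
  // Output: each interpreter name claimed by more than one class, with the claimants.
  def conflicts(registrations: Seq[(String, String)]): Map[String, Seq[String]] =
    registrations
      .groupBy { case (_, name) => name }
      .map { case (name, regs) => name -> regs.map(_._1) }
      .filter { case (_, classes) => classes.size > 1 }
}
```

For example, `NameConflicts.conflicts(Seq(("SparkRInterpreter", "spark.r"), ("RRepl", "spark.r"), ("OtherInterpreter", "spark.other")))` reports only `spark.r`, claimed by both `SparkRInterpreter` and `RRepl`.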

### What is this PR for?
It forces travis CI to use container-based infra for everything, not only `master`. It will result in [many benefits](https://docs.travis-ci.com/user/migrating-from-legacy/#Why-migrate-to-container-based-infrastructure%3F), improving CI stability and including reducing [obscure CI failures \w old Spark versions](https://issues.apache.org/jira/browse/ZEPPELIN-776?focusedCommentId=15219388&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15219388)

### What type of PR is it?
Improvement, Hotfix for apache#208

### What is the Jira issue?
[ZEPPELIN-776](https://issues.apache.org/jira/browse/ZEPPELIN-776)

### How should this be tested?
CI should be green, that is all.

### Questions:
* Does the licenses files need update? No
* Is there breaking changes for older versions? No
* Does this needs documentation? No

Author: Alexander Bezzubov <[email protected]>

Closes apache#808 from bzz/ZEPPELIN-776-force-ci-to-container-infra and squashes the following commits:

0e550f6 [Alexander Bezzubov] Remove obsolete notifications hook
d56ed0d [Alexander Bezzubov] ZEPPELIN-776: force CI to always use container-based infra

Author: Amos Elb <[email protected]>
Author: Amos B. Elberg <[email protected]>

Closes apache#208 from elbamos/rinterpreter and squashes the following commits:

ffc1a25 [Amos Elb] Fix rat issue
a08ec5b [Amos B. Elberg] R Interpreter

### What is this PR for?
This PR reapplies the fix from #769, which was reverted by #208. It also removes unnecessary code from interpreter.sh.

### What type of PR is it?
Bug Fix

### Todos
* [x] - Apply #769 again
* [x] - Remove unnecessary code

### What is the Jira issue?

### How should this be tested?

### Screenshots (if appropriate)

### Questions:
* Do the license files need an update? no
* Are there breaking changes for older versions? no
* Does this need documentation? no

Author: Lee moon soo <[email protected]>

Closes #889 from Leemoonsoo/fix_interpreter_sh_classpath and squashes the following commits:

8468fd5 [Lee moon soo] Remove unnecessary construction of classpath
ea8fee8 [Lee moon soo] Apply pr769 again, reverted by pr208

martin-g reviewed Aug 13, 2015

elbamos force-pushed the rinterpreter branch 2 times, most recently from 353eaa6 to 673b7bd on August 25, 2015 21:22

elbamos force-pushed the rinterpreter branch from 87f2232 to efd5b6e on August 26, 2015 15:06

elbamos force-pushed the rinterpreter branch from 1dcab63 to 4f372e3 on August 28, 2015 22:43

elbamos force-pushed the rinterpreter branch 2 times, most recently from ffeecb5 to c078252 on September 6, 2015 19:05

elbamos force-pushed the rinterpreter branch 5 times, most recently from 00433bb to 2e1fe2e on September 15, 2015 06:05

elbamos force-pushed the rinterpreter branch from 885e1d5 to 6b5a6a6 on April 2, 2016 18:04

bzz reviewed Apr 3, 2016

R Interpreter

a08ec5b

Working on CI CI CI CI CI permissions CI Should be good Triggering CI squashme - force push squashme CI Removing unused dependency squashme squashme squashme squashme squashme squashme License changes requested by @bzz squashme

elbamos force-pushed the rinterpreter branch from c3ac7ee to a08ec5b on April 3, 2016 06:05

Fix rat issue

ffc1a25

elbamos force-pushed the rinterpreter branch from dcc4ae3 to ffc1a25 on April 4, 2016 17:29

asfgit closed this in d5e87fb Apr 5, 2016

Leemoonsoo mentioned this pull request Apr 5, 2016

[WIP] Two SparkR implementation conflict in master branch #815

Closed

3 tasks

Leemoonsoo mentioned this pull request May 12, 2016

Fix interpreter.sh classpath #889

Closed

2 tasks

elbamos commented Aug 13, 2015

martin-g Aug 13, 2015

elbamos Aug 13, 2015

elbamos commented Aug 25, 2015

elbamos commented Aug 27, 2015

elbamos commented Sep 6, 2015

jabowles commented Sep 22, 2015

elbamos commented Sep 22, 2015

elbamos commented Sep 23, 2015

sourav-mazumder commented Oct 3, 2015

Leemoonsoo commented Nov 15, 2015

1. Scala->R invocation

2. KnitR Interpreter

3. Author, Copyrights and Maintainer tag

elbamos commented Nov 15, 2015

1. Scala-R invocation

2. Knitr*

3. The author tag

Leemoonsoo commented Nov 15, 2015

Leemoonsoo commented Nov 15, 2015

elbamos commented Nov 15, 2015

1. The R-Scala Interface

2. KnitR

3. KnitR GPL License

4. License any copyright

5. Location of the Package

6. Travis Builds

elbamos commented Nov 15, 2015

jongyoul commented Nov 16, 2015

elbamos commented Nov 16, 2015

jongyoul commented Nov 16, 2015

jongyoul commented Nov 16, 2015

elbamos commented Nov 16, 2015

bzz commented Apr 2, 2016

elbamos commented Apr 2, 2016

bzz Apr 3, 2016

bzz commented Apr 3, 2016

elbamos commented Apr 3, 2016

elbamos commented Apr 3, 2016

bzz commented Apr 4, 2016

elbamos commented Apr 4, 2016

bzz commented Apr 4, 2016

elbamos commented Apr 4, 2016