Conversation

@ankuriitg
Contributor

@ankuriitg ankuriitg commented Oct 1, 2018

…ethod in GeneralizedLinearRegressionTrainingSummary

What changes were proposed in this pull request?

After the change in SPARK-25118 (not yet submitted), which enables spark-shell to run with the default
log level, test_glr_summary started failing with a StackOverflowError.

Cause: ClosureCleaner calls logDebug on various objects, and when it is called
for GeneralizedLinearRegressionTrainingSummary, it starts a Spark job that runs
into an infinite loop and fails with the exception below.

Fix: Remove the toString method and move the existing logic to a new method,
summarize. This follows the general convention: other summary objects do not
have a toString method either.
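
A minimal sketch of the shape of the change (a hypothetical, simplified class, not the actual Spark source; the real summary computes many more fields):

```scala
// Hypothetical stand-in for GeneralizedLinearRegressionTrainingSummary,
// showing only the shape of the fix: the string-building logic moves from
// toString (invoked implicitly, e.g. by debug logging) to an explicit
// summarize method.
class TrainingSummarySketch {
  // In the real class these are lazy vals whose first access runs a Spark job.
  lazy val nullDeviance: Double = { println("(expensive job runs here)"); 42.0 }
  lazy val deviance: Double = { println("(expensive job runs here)"); 7.0 }

  // Callers now pay the cost only when they explicitly ask for the summary.
  def summarize: String =
    s"Null deviance: $nullDeviance\nResidual deviance: $deviance"

  // toString is intentionally not overridden, so logging this object prints
  // the default Object.toString and triggers no job.
}

object SummarizeDemo extends App {
  val summary = new TrainingSummarySketch
  println(summary.toString)  // cheap: default toString, no job
  println(summary.summarize) // lazy fields computed here, once
}
```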

How was this patch tested?

Ran the Python "mllib" and Scala/JUnit tests for this module

@ankuriitg ankuriitg changed the title [SPARK-25586][MLlib][Core] Replace toString method with a summarize m… [SPARK-25586][MLlib][Core] Replace toString method with summarize m… Oct 1, 2018
@ankuriitg
Contributor Author

@vanzin @squito @bersprockets

@squito
Contributor

squito commented Oct 1, 2018

Jenkins, ok to test

@ankuriitg
Contributor Author

@actuaryzhang @yanboliang

@srowen
Member

srowen commented Oct 1, 2018

This just removes toString entirely and adds a new method. I don't think we can do that. What's the minimum change to toString that means it doesn't execute a Spark job? Can the same info be preserved and presented otherwise?

Alternatively, we go back to the other JIRA and figure out why it was causing an infinite loop in the first place.

@ankuriitg
Contributor Author

A few variables (I think there are 3) printed in toString cause a Spark job to be started; the main reason is that those variables are lazily evaluated. I can remove those variables from toString altogether.

An alternative fix would be to make those lazily evaluated variables eagerly evaluated, but that may cause other issues that I am not aware of.
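
For context, a toy illustration (hypothetical class and field names, not the real summary) of how a lazy val backed by an RDD action silently starts a job the first time anything reads it, including a toString that interpolates it:

```scala
import org.apache.spark.sql.SparkSession

object LazyValJobDemo extends App {
  val spark = SparkSession.builder().master("local[2]").appName("demo").getOrCreate()

  class Summary {
    // First access runs a distributed job; lazy then caches the result.
    lazy val meanResidual: Double =
      spark.sparkContext.parallelize(1 to 100).map(_.toDouble).mean()

    // Interpolating meanResidual forces the lazy val, so merely logging or
    // printing this object triggers the job as a side effect.
    override def toString: String = s"Summary(meanResidual=$meanResidual)"
  }

  val s = new Summary
  println(s) // first access: a Spark job runs; later accesses reuse the value
  spark.stop()
}
```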

@SparkQA

SparkQA commented Oct 1, 2018

Test build #96830 has finished for PR 22604 at commit 8c162a7.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@ankuriitg
Contributor Author

If I remove the following code, then the test succeeds and no Spark jobs are started. Shall I do this instead?

```scala
sb.append("\n")
val nd = s"Null deviance: ${round(nullDeviance)} on $degreesOfFreedom degrees of freedom"
val rd = s"Residual deviance: ${round(deviance)} on $residualDegreeOfFreedom degrees of " +
  "freedom"
val l = math.max(nd.length, rd.length)
sb.append(StringUtils.leftPad(nd, l))
sb.append("\n")
sb.append(StringUtils.leftPad(rd, l))

if (family.name != "tweedie") {
  sb.append("\n")
  sb.append(s"AIC: " + round(aic))
}
```

@srowen
Member

srowen commented Oct 2, 2018

OK, backing up here: yes, it's important to specify what problem started this. It's exhibited in the error logs from your PR for SPARK-25118 (which is not yet submitted; it might be worth clarifying that in your description).

	at org.apache.spark.rdd.RDD.map(RDD.scala:371)
	at org.apache.spark.ml.regression.GeneralizedLinearRegressionSummary.nullDeviance$lzycompute(GeneralizedLinearRegression.scala:1342)
	at org.apache.spark.ml.regression.GeneralizedLinearRegressionSummary.nullDeviance(GeneralizedLinearRegression.scala:1315)
	at org.apache.spark.ml.regression.GeneralizedLinearRegressionTrainingSummary.toString(GeneralizedLinearRegression.scala:1556)
	at java.lang.String.valueOf(String.java:2994)
	at java.lang.StringBuilder.append(StringBuilder.java:131)
	at scala.StringContext.standardInterpolator(StringContext.scala:125)
	at scala.StringContext.s(StringContext.scala:95)
	at org.apache.spark.util.ClosureCleaner$$anonfun$org$apache$spark$util$ClosureCleaner$$clean$12$$anonfun$apply$6.apply(ClosureCleaner.scala:289)
	at org.apache.spark.util.ClosureCleaner$$anonfun$org$apache$spark$util$ClosureCleaner$$clean$12$$anonfun$apply$6.apply(ClosureCleaner.scala:289)
	at org.apache.spark.internal.Logging$class.logDebug(Logging.scala:62)
	at org.apache.spark.util.ClosureCleaner$.logDebug(ClosureCleaner.scala:35)
	at org.apache.spark.util.ClosureCleaner$$anonfun$org$apache$spark$util$ClosureCleaner$$clean$12.apply(ClosureCleaner.scala:289)
	at org.apache.spark.util.ClosureCleaner$$anonfun$org$apache$spark$util$ClosureCleaner$$clean$12.apply(ClosureCleaner.scala:289)
	at scala.collection.immutable.List.foreach(List.scala:392)
	at org.apache.spark.util.ClosureCleaner$.org$apache$spark$util$ClosureCleaner$$clean(ClosureCleaner.scala:289)
	at org.apache.spark.util.ClosureCleaner$.clean(ClosureCleaner.scala:162)
	at org.apache.spark.SparkContext.clean(SparkContext.scala:2342)
	at org.apache.spark.rdd.RDD$$anonfun$map$1.apply(RDD.scala:372)
	at org.apache.spark.rdd.RDD$$anonfun$map$1.apply(RDD.scala:371)
	at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
	at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:112)
	at org.apache.spark.rdd.RDD.withScope(RDD.scala:364)
	at org.apache.spark.rdd.RDD.map(RDD.scala:371)
	at org.apache.spark.ml.regression.GeneralizedLinearRegressionSummary.nullDeviance$lzycompute(GeneralizedLinearRegression.scala:1342)
	at org.apache.spark.ml.regression.GeneralizedLinearRegressionSummary.nullDeviance(GeneralizedLinearRegression.scala:1315)
	at org.apache.spark.ml.regression.GeneralizedLinearRegressionTrainingSummary.toString(GeneralizedLinearRegression.scala:1556)
	at java.lang.String.valueOf(String.java:2994)
	at java.lang.StringBuilder.append(StringBuilder.java:131)
	at scala.StringContext.standardInterpolator(StringContext.scala:125)
	at scala.StringContext.s(StringContext.scala:95)
	at org.apache.spark.util.ClosureCleaner$$anonfun$org$apache$spark$util$ClosureCleaner$$clean$12$$anonfun$apply$6.apply(ClosureCleaner.scala:289)
	at org.apache.spark.util.ClosureCleaner$$anonfun$org$apache$spark$util$ClosureCleaner$$clean$12$$anonfun$apply$6.apply(ClosureCleaner.scala:289)
	at org.apache.spark.internal.Logging$class.logDebug(Logging.scala:62)
	at org.apache.spark.util.ClosureCleaner$.logDebug(ClosureCleaner.scala:35)
	at org.apache.spark.util.ClosureCleaner$$anonfun$org$apache$spark$util$ClosureCleaner$$clean$12.apply(ClosureCleaner.scala:289)
	at org.apache.spark.util.ClosureCleaner$$anonfun$org$apache$spark$util$ClosureCleaner$$clean$12.apply(ClosureCleaner.scala:289)
	at scala.collection.immutable.List.foreach(List.scala:392)
	at org.apache.spark.util.ClosureCleaner$.org$apache$spark$util$ClosureCleaner$$clean(ClosureCleaner.scala:289)
	at org.apache.spark.util.ClosureCleaner$.clean(ClosureCleaner.scala:162)

Yes, inside toString, computing nullDeviance and deviance runs a Spark job. The problem occurs at the debug log level, because in the ClosureCleaner:

      if (log.isDebugEnabled) {
        ...
        logDebug(s" + outer objects: ${outerObjects.size}")
        outerObjects.foreach { o => logDebug(s"     $o") }
      }

and that calls toString again, hence the infinite recursion.

While I don't think it's ideal to run distributed jobs in toString, they're only computed once, lazily, so it's actually reasonable.

We could remove nullDeviance etc., but that's useful info, and I wonder whether similar problems are lurking elsewhere. That said, this code has existed since about Spark 1.4 without issue, I guess.

But then, the value of this debug statement goes away almost entirely, I think, in Scala 2.12, where closures aren't implemented with outer classes.

I think I favor just removing this last debug statement in ClosureCleaner, or at most logging the objects' classes only. That seems like less of a loss, and less of a behavior change, than removing useful info from toString.

Any other thoughts out there -- anyone care a lot about the debug info from the closure cleaner about outer class(es)?
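
The class-only option might look something like this in ClosureCleaner (a sketch of the idea, not a committed patch):

```scala
// Sketch: log only the class name of each outer object. getClass.getName
// never invokes the object's own toString, so no lazy, job-triggering
// fields are touched while logging.
if (log.isDebugEnabled) {
  // ...
  logDebug(s" + outer objects: ${outerObjects.size}")
  outerObjects.foreach { o => logDebug(s"     ${o.getClass.getName}") }
}
```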

@ankuriitg
Contributor Author

Makes sense; let me update the description regarding SPARK-25118.

@ankuriitg
Contributor Author

I have opened another PR for this, with the recommended changes: #22616

I had to change a few other logDebug statements as well. Please let me know if that works.

@srowen
Member

srowen commented Oct 2, 2018

OK, this PR should be closed then.

@ankuriitg ankuriitg closed this Oct 2, 2018