Contributor

@holdenk holdenk commented Oct 22, 2015

Add PMMLExportable to ML and use it for KMeans in the ML pipeline. This is a bit different from the PMMLExportable in MLlib: the default trait doesn't go through a factory but instead depends on the model implementing the export itself (and at the start that will mostly just mean calling the parent model's method).
Still a WIP but wanted to see if anyone had comments about the slight change.
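
To make the design difference concrete, here is a rough, self-contained sketch of the "model implements it itself" idea. It is not the code in this PR: every name ending in `Sketch` is hypothetical, nothing here depends on Spark, and the XML is a simplified PMML-like fragment rather than a schema-complete document.

```scala
// Hypothetical sketch; names are illustrative and are not the code in this PR.

/** A model mixes this in and produces its own PMML, with no central factory. */
trait PMMLExportableSketch {
  def toPMML(): String
}

/** Stand-in for an older MLlib-style model that already knows how to export itself. */
class ParentKMeansModelSketch(val clusterCenters: Seq[Seq[Double]]) {
  def toPMML(): String = {
    // One simplified element per cluster center (assumed layout, not the full PMML schema).
    val clusters = clusterCenters.zipWithIndex.map { case (center, i) =>
      s"""  <Cluster name="cluster_$i">${center.mkString(" ")}</Cluster>"""
    }
    (Seq(s"""<ClusteringModel numberOfClusters="${clusterCenters.size}">""") ++
      clusters ++ Seq("</ClusteringModel>")).mkString("\n")
  }
}

/** Stand-in for the ML pipeline model: it implements the trait by delegating to its parent. */
class KMeansModelSketch(parentModel: ParentKMeansModelSketch) extends PMMLExportableSketch {
  override def toPMML(): String = parentModel.toPMML()
}
```

With this shape, calling `toPMML()` on a `KMeansModelSketch` simply forwards to the wrapped parent model, which is the "mostly just calling the parent model's method" case described above. A model without such a parent would have to build its PMML directly in its own `toPMML()`.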


SparkQA commented Oct 22, 2015

Test build #44104 has finished for PR 9207 at commit 368879f.

  • This patch fails Scala style tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
    • trait PMMLExportable

@holdenk holdenk force-pushed the SPARK-11171-SPARK-11237-Add-PMML-export-for-ML-KMeans branch from 368879f to 1749aec Compare October 22, 2015 00:13

SparkQA commented Oct 22, 2015

Test build #44105 has finished for PR 9207 at commit 1749aec.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
    • trait PMMLExportable


SparkQA commented Oct 22, 2015

Test build #44111 has finished for PR 9207 at commit bc1b508.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
    • trait PMMLExportable

… only (since actual PMML model evaluation is to be done through a spark-packages project and previous JIRA decided re-loading PMML is out of project scope)
@holdenk holdenk changed the title from "[SPARK-11171][SPARK-11237][SPARK-11241][ML][WIP] Try adding PMMLExportable to ML with KMeans" to "[SPARK-11171][SPARK-11237][SPARK-11241][ML] Try adding PMMLExportable to ML with KMeans" Oct 23, 2015
Contributor Author

holdenk commented Oct 23, 2015

cc @jkbradley


SparkQA commented Oct 23, 2015

Test build #44241 has finished for PR 9207 at commit adf0b36.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
    • trait PMMLExportable

Contributor Author

holdenk commented Oct 29, 2015

re-ping @jkbradley

Contributor Author

holdenk commented Nov 8, 2015

re-ping @jkbradley

Contributor Author

holdenk commented Nov 8, 2015

Also maybe @dbtsai if you have a chance to look at this that would be great.

Contributor Author

holdenk commented Dec 1, 2015

So I've updated this against master @jkbradley


SparkQA commented Dec 1, 2015

Test build #46990 has finished for PR 9207 at commit 494ecbf.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

Member

Seems this is a copy-paste of org.apache.spark.mllib.pmml, should we deprecate the mllib one, and use the new one in ml package?

Contributor Author

I think that might be good. The main difference is that this avoids the factory implementation that the MLlib API used.

Contributor Author

(Uses the same public-facing API, as per the JIRA discussion re: the lack of complaints from users about the old API)

Contributor Author

holdenk commented Dec 16, 2015

ping @jkbradley


SparkQA commented Dec 30, 2015

Test build #48499 has finished for PR 9207 at commit 41611b8.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.


SparkQA commented Jan 11, 2016

Test build #49179 has finished for PR 9207 at commit 9525283.

  • This patch fails Scala style tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

Contributor Author

holdenk commented Jan 12, 2016

jenkins retest this please


SparkQA commented Jan 12, 2016

Test build #49214 has finished for PR 9207 at commit 461c1ce.

  • This patch fails Scala style tests.
  • This patch merges cleanly.
  • This patch adds no public classes.


SparkQA commented Jan 21, 2016

Test build #49819 has finished for PR 9207 at commit b514421.

  • This patch fails PySpark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

Contributor Author

holdenk commented Jan 21, 2016

Jenkins retest this please


SparkQA commented Jan 21, 2016

Test build #49837 has finished for PR 9207 at commit b514421.

  • This patch fails PySpark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.


SparkQA commented Jul 3, 2016

Test build #61683 has finished for PR 9207 at commit 49f8a8d.

  • This patch fails to build.
  • This patch merges cleanly.
  • This patch adds no public classes.


SparkQA commented Jul 3, 2016

Test build #61684 has finished for PR 9207 at commit e6845f1.

  • This patch fails MiMa tests.
  • This patch merges cleanly.
  • This patch adds no public classes.


SparkQA commented Jul 5, 2016

Test build #61784 has finished for PR 9207 at commit 9170b3f.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

Contributor Author

holdenk commented Jul 5, 2016

HiveSubmit failures seem unrelated, jenkins retest this please.


SparkQA commented Jul 6, 2016

Test build #61799 has finished for PR 9207 at commit 9170b3f.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.


SparkQA commented Jul 18, 2016

Test build #62478 has finished for PR 9207 at commit 8579c1b.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

Contributor Author

holdenk commented Jul 25, 2016

Now that 2.0 RC5 has passed - maybe it would be an OK time to revisit this? I think it can also be an ok base for adding more formats if we need to.


SparkQA commented Aug 1, 2016

Test build #63086 has finished for PR 9207 at commit 8103b76.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.


SparkQA commented Aug 4, 2016

Test build #63193 has finished for PR 9207 at commit bdcfbd1.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
    • case class Least(children: Seq[Expression]) extends Expression
    • case class Greatest(children: Seq[Expression]) extends Expression
    • implicit class SchemaAttribute(f: StructField)


SparkQA commented Aug 6, 2016

Test build #63295 has finished for PR 9207 at commit 00173aa.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
    • public class ShuffleIndexInformation
    • public class ShuffleIndexRecord
    • case class CreateTable(tableDesc: CatalogTable, mode: SaveMode, query: Option[LogicalPlan])
    • case class PreprocessDDL(conf: SQLConf) extends Rule[LogicalPlan]

Contributor Author

holdenk commented Aug 8, 2016

ping @MLnick now that 2.0 has been out for a while.


SparkQA commented Sep 9, 2016

Test build #65123 has finished for PR 9207 at commit 0e8c523.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

Contributor Author

holdenk commented Nov 16, 2016

Ping @jkbradley / @MLnick - we are coming up on a year since we told people we would add PMML export to Spark in about a year so I figured I'd try and update this again. I'll update this PR tonight against the latest master.


SparkQA commented Nov 16, 2016

Test build #68724 has finished for PR 9207 at commit 9cb8994.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

Contributor Author

holdenk commented Sep 21, 2017

@MLnick: Do you have the bandwidth to revisit this? I'm open to refactoring to a more pluggable approach if we've got the review bandwidth for it.


SparkQA commented Sep 23, 2017

Test build #82102 has finished for PR 9207 at commit 9cb8994.

  • This patch fails Spark unit tests.
  • This patch does not merge cleanly.
  • This patch adds no public classes.

Contributor Author

holdenk commented Mar 5, 2018

Closing this in favor of the approach taken in #19876

@holdenk holdenk closed this Mar 5, 2018

7 participants