[SPARK-51132][ML][BUILD] Upgrade JPMML to 1.7.1
#49854
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -26,6 +26,25 @@ Note that this migration guide describes the items specific to MLlib. | |
| Many items of SQL migration can be applied when migrating MLlib to higher versions for DataFrame-based APIs. | ||
| Please refer [Migration Guide: SQL, Datasets and DataFrame](sql-migration-guide.html). | ||
|
|
||
| ## Upgrading from MLlib 3.5 to 4.0 | ||
|
|
||
| ### Breaking changes | ||
| {:.no_toc} | ||
|
|
||
| There are no breaking changes. | ||
|
|
||
| ### Deprecations and changes of behavior | ||
| {:.no_toc} | ||
|
|
||
| **Deprecations** | ||
|
|
||
| There are no deprecations. | ||
|
|
||
| **Changes of behavior** | ||
|
|
||
| * [SPARK-51132](https://issues.apache.org/jira/browse/SPARK-51132): | ||
| The PMML XML schema version of exported PMML format models by [PMML model export](mllib-pmml-model-export.html) has been upgraded from `PMML-4_3` to `PMML-4_4`. | ||
|
Contributor
There's just one question: What's the difference between
Contributor
If you didn't change anything in your application code, the only perceivable difference will be a new XML namespace declaration on the first line of the exported PMML documents. Previously, the top-level PMML element was
The JPMML-Model library defaults to the latest PMML schema version in its output (i.e. If you really want to, you can keep outputting PMML 4.3 schema version documents by "filtering" the output stream using the
    JAXBSerializer jaxbSerializer = new JAXBSerializer();
    OutputStream os = ...
    // Wrap the target stream under a new name; the original snippet redeclared `os`, which does not compile
    try(OutputStream pmmlOs = new PMMLOutputStream(os, Version.PMML_4_3)){
      jaxbSerializer.serializePretty(pmml, pmmlOs);
    } |
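
To make the namespace change discussed above concrete, here is a minimal Java sketch that reads the namespace URI from the root element of a PMML document using only the JDK's XML parser. The class name `PmmlNamespaceCheck`, the helper `rootNamespace`, and the two sample strings are hypothetical stand-ins written for illustration; they are not actual Spark or JPMML output.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

public class PmmlNamespaceCheck {

    // Hypothetical helper: returns the namespace URI declared on the root element,
    // or null if the root element has no namespace.
    static String rootNamespace(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            // Namespace awareness must be enabled, otherwise getNamespaceURI() returns null.
            factory.setNamespaceAware(true);
            Document doc = factory.newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            return doc.getDocumentElement().getNamespaceURI();
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse PMML document", e);
        }
    }

    public static void main(String[] args) {
        // Hand-written stand-ins for documents exported before and after the upgrade.
        String before = "<PMML xmlns=\"http://www.dmg.org/PMML-4_3\"/>";
        String after = "<PMML xmlns=\"http://www.dmg.org/PMML-4_4\"/>";
        System.out.println(rootNamespace(before)); // http://www.dmg.org/PMML-4_3
        System.out.println(rootNamespace(after));  // http://www.dmg.org/PMML-4_4
    }
}
```

A downstream consumer that hard-codes the `PMML-4_3` namespace could use a check like this to detect documents exported after the upgrade.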
||
|
|
||
| ## Upgrading from MLlib 2.4 to 3.0 | ||
|
|
||
| ### Breaking changes | ||
|
|
||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -19,8 +19,8 @@ package org.apache.spark.mllib.pmml.`export` | |
|
|
||
| import scala.{Array => SArray} | ||
|
|
||
| import org.dmg.pmml.{DataDictionary, DataField, DataType, FieldName, MiningField, | ||
| MiningFunction, MiningSchema, OpType} | ||
| import org.dmg.pmml.{DataDictionary, DataField, DataType, MiningField, MiningFunction, | ||
| MiningSchema, OpType} | ||
| import org.dmg.pmml.regression.{NumericPredictor, RegressionModel, RegressionTable} | ||
|
|
||
| import org.apache.spark.mllib.regression.GeneralizedLinearModel | ||
|
|
@@ -44,7 +44,7 @@ private[mllib] class BinaryClassificationPMMLModelExport( | |
| pmml.getHeader.setDescription(description) | ||
|
|
||
| if (model.weights.size > 0) { | ||
| val fields = new SArray[FieldName](model.weights.size) | ||
| val fields = new SArray[String](model.weights.size) | ||
|
Contributor
Author
Changes come from: jpmml/jpmml-model@5969dc2 |
||
| val dataDictionary = new DataDictionary | ||
| val miningSchema = new MiningSchema | ||
| val regressionTableYES = new RegressionTable(model.intercept).setTargetCategory("1") | ||
|
|
@@ -67,7 +67,7 @@ private[mllib] class BinaryClassificationPMMLModelExport( | |
| .addRegressionTables(regressionTableYES, regressionTableNO) | ||
|
|
||
| for (i <- 0 until model.weights.size) { | ||
| fields(i) = FieldName.create("field_" + i) | ||
| fields(i) = "field_" + i | ||
| dataDictionary.addDataFields(new DataField(fields(i), OpType.CONTINUOUS, DataType.DOUBLE)) | ||
| miningSchema | ||
| .addMiningFields(new MiningField(fields(i)) | ||
|
|
@@ -76,7 +76,7 @@ private[mllib] class BinaryClassificationPMMLModelExport( | |
| } | ||
|
|
||
| // add target field | ||
| val targetField = FieldName.create("target") | ||
| val targetField = "target" | ||
| dataDictionary | ||
| .addDataFields(new DataField(targetField, OpType.CATEGORICAL, DataType.STRING)) | ||
| miningSchema | ||
|
|
||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -23,7 +23,7 @@ import java.util.Locale | |
|
|
||
| import scala.beans.BeanProperty | ||
|
|
||
| import org.dmg.pmml.{Application, Header, PMML, Timestamp} | ||
| import org.dmg.pmml.{Application, Header, PMML, Timestamp, Version} | ||
|
|
||
| private[mllib] trait PMMLModelExport { | ||
|
|
||
|
|
@@ -44,6 +44,6 @@ private[mllib] trait PMMLModelExport { | |
| val header = new Header() | ||
| .setApplication(app) | ||
| .setTimestamp(timestamp) | ||
| new PMML("4.2", header, null) | ||
|
Contributor
Author
In fact, the PMML standard was already updated to 4.3 in JPMML 1.4.7, so for consistency this value should have been changed to 4.3 at that time. |
||
| new PMML(Version.PMML_4_4.getVersion(), header, null) | ||
| } | ||
| } | ||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -1165,7 +1165,7 @@ class LinearRegressionSuite extends MLTest with DefaultReadWriteTest with PMMLRe | |
| assert(fields(0).getOpType() == OpType.CONTINUOUS) | ||
| val pmmlRegressionModel = pmml.getModels().get(0).asInstanceOf[PMMLRegressionModel] | ||
| val pmmlPredictors = pmmlRegressionModel.getRegressionTables.get(0).getNumericPredictors | ||
| val pmmlWeights = pmmlPredictors.asScala.map(_.getCoefficient()).toList | ||
| val pmmlWeights = pmmlPredictors.asScala.map(_.getCoefficient().doubleValue()).toList | ||
|
Contributor
Author
Changes come from: jpmml/jpmml-model@6d356fe |
||
| assert(pmmlWeights(0) ~== model.coefficients(0) relTol 1E-3) | ||
| assert(pmmlWeights(1) ~== model.coefficients(1) relTol 1E-3) | ||
| } | ||
|
|
||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -571,7 +571,7 @@ | |
| <dependency> | ||
| <groupId>org.jpmml</groupId> | ||
| <artifactId>pmml-model</artifactId> | ||
| <version>1.4.8</version> | ||
| <version>1.7.1</version> | ||
| <scope>provided</scope> | ||
| <exclusions> | ||
| <exclusion> | ||
|
|
@@ -599,32 +599,24 @@ | |
| <dependency> | ||
| <groupId>org.glassfish.jaxb</groupId> | ||
| <artifactId>jaxb-runtime</artifactId> | ||
| <version>2.3.2</version> | ||
| <version>4.0.5</version> | ||
|
Member
Is there a version contract between this dep and other
Contributor
Author
There is currently no mutual dependence between the But they both depend on |
||
| <scope>compile</scope> | ||
| <exclusions> | ||
| <!-- for now, we only write XML in PMML export, and these can be excluded --> | ||
| <exclusion> | ||
| <groupId>com.sun.xml.fastinfoset</groupId> | ||
| <artifactId>FastInfoset</artifactId> | ||
| </exclusion> | ||
| <exclusion> | ||
| <groupId>org.glassfish.jaxb</groupId> | ||
| <artifactId>txw2</artifactId> | ||
| </exclusion> | ||
| <exclusion> | ||
| <groupId>org.jvnet.staxex</groupId> | ||
| <artifactId>stax-ex</artifactId> | ||
| </exclusion> | ||
| <!-- | ||
| SPARK-27611: Exclude redundant javax.activation implementation, which | ||
| conflicts with the existing javax.activation:activation:1.1.1 dependency. | ||
| --> | ||
| <exclusion> | ||
| <groupId>jakarta.activation</groupId> | ||
|
Contributor
Author
In the current
Member
Some exclusions are invalid now; please remove them.
Contributor
Author
Hmm, I found that the
Member
I mean
Contributor
Author
Thanks for pointing this out; I have updated it. |
||
| <artifactId>jakarta.activation-api</artifactId> | ||
| <groupId>org.eclipse.angus</groupId> | ||
| <artifactId>angus-activation</artifactId> | ||
| </exclusion> | ||
| </exclusions> | ||
| </dependency> | ||
| <dependency> | ||
| <groupId>jakarta.xml.bind</groupId> | ||
| <artifactId>jakarta.xml.bind-api</artifactId> | ||
| <version>4.0.2</version> | ||
| </dependency> | ||
| <dependency> | ||
| <groupId>org.apache.commons</groupId> | ||
| <artifactId>commons-lang3</artifactId> | ||
|
|
@@ -1061,13 +1053,6 @@ | |
| <groupId>org.glassfish.jersey.core</groupId> | ||
| <artifactId>jersey-server</artifactId> | ||
| <version>${jersey.version}</version> | ||
| <!-- SPARK-28765 Unused JDK11-specific dependency --> | ||
| <exclusions> | ||
| <exclusion> | ||
| <groupId>jakarta.xml.bind</groupId> | ||
| <artifactId>jakarta.xml.bind-api</artifactId> | ||
| </exclusion> | ||
| </exclusions> | ||
| </dependency> | ||
| <dependency> | ||
| <groupId>org.glassfish.jersey.core</groupId> | ||
|
|
||