@jkbradley
Member

Pipeline and PipelineModel extend Readable and Writable. Persistence succeeds only when all stages are Writable.
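The rule above (saving works only if every stage is Writable) can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: `Stage`, `Writable`, `WritableStage`, `OpaqueStage`, and the `stages/<index>_<uid>` path layout are hypothetical stand-ins, and no Spark classes are used.

```scala
import scala.util.Try

// Hypothetical stand-ins for the traits discussed in this PR; not Spark's real definitions.
trait Stage { def uid: String }
trait Writable { def write(path: String): Unit }

// A stage that supports persistence. This sketch writes nothing to disk.
final class WritableStage(val uid: String) extends Stage with Writable {
  def write(path: String): Unit = ()
}

// A stage that does not support persistence.
final class OpaqueStage(val uid: String) extends Stage

object PipelineSaveSketch {
  // Validate every stage before writing any of them, so a failed save
  // does not leave a partially written pipeline behind.
  def save(stages: Seq[Stage], path: String): Try[Seq[String]] = Try {
    val unwritable = stages.filterNot(_.isInstanceOf[Writable])
    if (unwritable.nonEmpty) {
      throw new UnsupportedOperationException(
        "Stages without Writable: " + unwritable.map(_.uid).mkString(", "))
    }
    stages.zipWithIndex.map { case (stage, idx) =>
      // Assumed layout for illustration only: one directory per stage.
      val stagePath = s"$path/stages/${idx}_${stage.uid}"
      stage.asInstanceOf[Writable].write(stagePath)
      stagePath
    }
  }
}
```

With this shape, a pipeline containing a single non-Writable stage fails up front with a message naming that stage, and nothing is written.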

Note: This PR reinstates tests for other read/write functionality. It should probably not get merged until [https://issues.apache.org/jira/browse/SPARK-11672] gets fixed.

CC: @mengxr

@SparkQA

SparkQA commented Nov 12, 2015

Test build #45764 has finished for PR 9674 at commit 3700091.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
      * class Pipeline(override val uid: String) extends Estimator[PipelineModel] with Writable

@mengxr
Contributor

mengxr commented Nov 13, 2015

test this please

@SparkQA

SparkQA commented Nov 13, 2015

Test build #45897 has finished for PR 9674 at commit caf57c2.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
      * class Pipeline(override val uid: String) extends Estimator[PipelineModel] with Writable

@mengxr
Contributor

mengxr commented Nov 16, 2015

test this please

@SparkQA

SparkQA commented Nov 16, 2015

Test build #46012 has finished for PR 9674 at commit caf57c2.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
      * class Pipeline(override val uid: String) extends Estimator[PipelineModel] with Writable

Contributor

Should users be able to save an incomplete pipeline? For example, I could make a template pipeline, send it to other users, and they only need to fill in some required params like inputCol after they load it back.
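One way to picture the template use case is to keep save-time and fit-time checks separate: an incomplete stage can round-trip through storage, and missing params are reported only when someone tries to use it. The sketch below is hypothetical throughout. `StageTemplate`, `TemplateSketch`, and the `uid|required|params` text encoding are invented for illustration and do not reflect Spark's Params API or its on-disk format; the encoding also assumes names and values contain none of `|`, `,`, `;`, or `=`.

```scala
// Hypothetical model of a stage's params; Spark's real Params API is richer.
final case class StageTemplate(
    uid: String,
    params: Map[String, String],
    required: Set[String]) {

  // Required params that have not been set yet.
  def missing: Set[String] = required -- params.keySet

  // Return a copy with one more param filled in.
  def set(name: String, value: String): StageTemplate =
    copy(params = params + (name -> value))
}

object TemplateSketch {
  // Saving does not check completeness, so an unfinished template can be shared.
  def serialize(t: StageTemplate): String = {
    val req = t.required.toSeq.sorted.mkString(",")
    val ps = t.params.toSeq.sortBy(_._1).map { case (k, v) => s"$k=$v" }.mkString(";")
    s"${t.uid}|$req|$ps"
  }

  def deserialize(s: String): StageTemplate =
    s.split("\\|", -1) match {
      case Array(uid, req, ps) =>
        val required = req.split(",").filter(_.nonEmpty).toSet
        val params = ps.split(";").filter(_.nonEmpty).map { kv =>
          val i = kv.indexOf('=')
          kv.take(i) -> kv.drop(i + 1)
        }.toMap
        StageTemplate(uid, params, required)
      case _ =>
        throw new IllegalArgumentException(s"Malformed template: $s")
    }
}
```

A recipient would load the template, call `set("inputCol", ...)`, and only then expect `missing` to be empty.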

@mengxr
Contributor

mengxr commented Nov 16, 2015

One suggestion is to merge PipelineSharedWriter and PipelineSharedReader into a single object under object Pipeline, e.g., called SharedReadWrite. Then move PipelineReader and PipelineWriter to object Pipeline, and PipelineModelReader and PipelineModelWriter to object PipelineModel. The main purpose is to avoid polluting the package namespace in Java; otherwise, all of these classes are visible under org.apache.spark.ml in Java.
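The layout being suggested might look roughly like the sketch below. Only the nesting is the point: the class and object names follow the comment above, but every method, field, and body (`encode`, `decode`, `render`, `parse`, `stageUids`) is a placeholder invented for illustration, not Spark's implementation. The likely motivation, as far as I understand it, is that Scala's package-qualified `private[...]` is not enforced in bytecode, so top-level helpers would appear to Java callers as ordinary classes in the package.

```scala
// Illustrative layout only: bodies are placeholders, not Spark's implementation.
final class Pipeline(val uid: String, val stageUids: Seq[String])

object Pipeline {
  // Logic shared by the Pipeline and PipelineModel readers and writers lives
  // inside the companion object instead of at package level.
  object SharedReadWrite {
    def encode(uid: String, stageUids: Seq[String]): String =
      (uid +: stageUids).mkString(",")

    def decode(line: String): (String, Seq[String]) = {
      val parts = line.split(",").toSeq
      (parts.head, parts.tail)
    }
  }

  final class PipelineWriter(instance: Pipeline) {
    def render(): String =
      SharedReadWrite.encode(instance.uid, instance.stageUids)
  }

  final class PipelineReader {
    def parse(line: String): Pipeline = {
      val (uid, stageUids) = SharedReadWrite.decode(line)
      new Pipeline(uid, stageUids)
    }
  }
}

final class PipelineModel(val uid: String, val stageUids: Seq[String])

object PipelineModel {
  // The model's reader and writer reuse the helper nested under Pipeline.
  final class PipelineModelWriter(instance: PipelineModel) {
    def render(): String =
      Pipeline.SharedReadWrite.encode(instance.uid, instance.stageUids)
  }

  final class PipelineModelReader {
    def parse(line: String): PipelineModel = {
      val (uid, stageUids) = Pipeline.SharedReadWrite.decode(line)
      new PipelineModel(uid, stageUids)
    }
  }
}
```

From Java, the helpers would then be reached through `Pipeline` and `PipelineModel` rather than appearing as separate top-level names in the package.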

@jkbradley
Member Author

@mengxr Thanks for reviewing! I believe I addressed everything, except where I quibbled in responses above.

@mengxr
Contributor

mengxr commented Nov 16, 2015

LGTM pending Jenkins.

@SparkQA

SparkQA commented Nov 17, 2015

Test build #46026 has finished for PR 9674 at commit f791010.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
      * class Pipeline(override val uid: String) extends Estimator[PipelineModel] with Writable

@jkbradley
Member Author

@mengxr Thank you for reviewing! Merging with master and branch-1.6

asfgit pushed a commit that referenced this pull request Nov 17, 2015
Author: Joseph K. Bradley <[email protected]>

Closes #9674 from jkbradley/pipeline-io.

(cherry picked from commit 1c5475f)
Signed-off-by: Joseph K. Bradley <[email protected]>
@asfgit asfgit closed this in 1c5475f Nov 17, 2015
@jkbradley jkbradley deleted the pipeline-io branch November 17, 2015 01:20