From e6d424dd633cafa57d172375d9f36242d2250e00 Mon Sep 17 00:00:00 2001 From: Adrian Garcia Badaracco <1755071+adriangb@users.noreply.github.com> Date: Wed, 2 Sep 2020 18:08:53 -0500 Subject: [PATCH 01/26] incorporate suggestions --- rfcs/20200902-pickle-for-keras/README.md | 54 +++++++++++++++++------- 1 file changed, 38 insertions(+), 16 deletions(-) diff --git a/rfcs/20200902-pickle-for-keras/README.md b/rfcs/20200902-pickle-for-keras/README.md index 847ce227e..edde14bbd 100644 --- a/rfcs/20200902-pickle-for-keras/README.md +++ b/rfcs/20200902-pickle-for-keras/README.md @@ -2,8 +2,8 @@ | Status | (Proposed / Accepted / Implemented / Obsolete) | :-------------- |:---------------------------------------------------- | -| **RFC #** | [286](https://github.com/tensorflow/community/pull/286) -| **Author(s)** | Adrian Garcia Badaracco ({firstname}@{firstname}gb.com) +| **RFC #** | [286](https://github.com/tensorflow/community/pull/286) | +| **Author(s)** | Adrian Garcia Badaracco ({firstname}@{firstname}gb.com), Scott Sievert (tf-rfc@stsievert.com) | | **Sponsor** | Mihai Maruseac (mihaimaruseac@google.com) | | **Updated** | 2020-09-02 | @@ -13,18 +13,33 @@ Implement support for Python's Pickle protocol within Keras. ## Motivation -The pickle protocol is used extensively -within the Python ecosystem, including by [Dask](https://github.com/dask/dask), -[Scikit-Learn](https://github.com/scikit-learn/scikit-learn) and several other -popular machine learning libraries. These libraries rely on the pickle protocol and -cannot work without it. This hinders what would otherwise be great uses of Keras. +> *Why this is a valuable problem to solve? What background information is needed +to show how this design addresses the problem?* +The specific motivation +for this RFC comes from supporting Keras models in Dask-ML's and Ray's hyperparameter +optimization. 
More generally, support for serialization with the Pickle protocol will enable: -Pickle and `copy` (referring to the specific Python module, it can use -the pickle protocol as it's backend) are also the _only_ universal way to -save Python objects. This means that this is what most users try first. -As evidenced by several of the StackOverflow/TensorFlow issues below, even if there are -TensorFlow specific ways to copy things in memory or disk, users will probably -try pickle or copy first and be confused by cryptic errors. +* Using Keras with distributed systems (e.g., Python's `multiprocessing`, Dask, Ray or IPython parallel). +* Copying Keras models with Python's built-in `copy.deepcopy`. + +This request is *not* advocating for use of Pickle while saving or sharing +Keras models. We believe the efficient, secure and stable methods in TF should be +used for that. We are proposing to add a Pickle implementation that uses the same efficient method. +This will enable wider usage in the Python ecosystem. + +> *Which users are affected by the problem? Why is it a problem? What data supports +this? What related work exists?* +Users trying to use distributed systems (e.g., Ray or Dask) with Keras are +affected. In our experience, this is common in hyperparameter optimization. + +Related work is in [SciKeras], which brings a Scikit-Learn API +to Keras. Scikit-Learn estimators must be able to be pickled ([source][skp]). As such, +SciKeras has an implementation of `__reduce_ex__`, which is also in +[tensorflow#39609]. + +[tensorflow#39609]:https://github.com/tensorflow/tensorflow/pull/39609 +[SciKeras]:https://github.com/adriangb/scikeras +[skp]:https://github.com/scikit-learn/scikit-learn/blob/0fb307bf39bbdacd6ed713c00724f8f871d60370/sklearn/utils/estimator_checks.py#L1523-L1524 Here are several examples of Keras users running into issues because pickle is not supported. 
@@ -52,9 +67,16 @@ Here is a post describing how PyTorch ran into similar issues and how they reso ## User Benefit -* Lessen the learning curve for new Keras/TF users since they will be able to -use entry points they already know. -* Improve compatibility with libraries like Scikit-Learn and Dask. +> How will users (or other contributors) benefit from this work? What would be the headline in the release notes or blog post? + +One blog post headline: "Keras models can be used in advanced hyperparameter optimization with Ray Tune or Dask-ML." + +Users will notice: + +* Improved compatibility with libraries like Scikit-Learn and Dask. +* Smaller learning curve for new Keras/TF users since they will be able to +use entry points they already know and simultaneously access TF's high-performance +internal methods. ## Design Proposal From 2345c40ce53b739205d67a4df482fe077de3af43 Mon Sep 17 00:00:00 2001 From: Scott Sievert Date: Wed, 2 Sep 2020 20:06:59 -0500 Subject: [PATCH 02/26] Update README.md --- rfcs/20200902-pickle-for-keras/README.md | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/rfcs/20200902-pickle-for-keras/README.md b/rfcs/20200902-pickle-for-keras/README.md index edde14bbd..42864ad87 100644 --- a/rfcs/20200902-pickle-for-keras/README.md +++ b/rfcs/20200902-pickle-for-keras/README.md @@ -15,6 +15,7 @@ Implement support for Python's Pickle protocol within Keras. > *Why this is a valuable problem to solve? What background information is needed to show how this design addresses the problem?* + The specific motivation for this RFC comes from supporting Keras models in Dask-ML's and Ray's hyperparameter optimization. More generally, support for serialization with the Pickle protocol will enable: @@ -29,6 +30,7 @@ This will enable wider usage in the Python ecosystem. > *Which users are affected by the problem? Why is it a problem? What data supports this? 
What related work exists?* + Users trying to use distributed systems (e.g., Ray or Dask) with Keras are affected. In our experience, this is common in hyperparameter optimization. @@ -195,4 +197,4 @@ users to use `Model.save` when they are trying to save their model. ## Questions and Discussion Topics -Seed this with open questions you require feedback on from the RFC process. \ No newline at end of file +Seed this with open questions you require feedback on from the RFC process. From cb48f7f32089d80cf27a4c05c3351751442699f0 Mon Sep 17 00:00:00 2001 From: Adrian Garcia Badaracco <1755071+adriangb@users.noreply.github.com> Date: Wed, 2 Sep 2020 23:17:15 -0500 Subject: [PATCH 03/26] add note about SavedModel --- rfcs/20200902-pickle-for-keras/README.md | 16 +++++++++------- 1 file changed, 9 insertions(+), 7 deletions(-) diff --git a/rfcs/20200902-pickle-for-keras/README.md b/rfcs/20200902-pickle-for-keras/README.md index 42864ad87..3bb376276 100644 --- a/rfcs/20200902-pickle-for-keras/README.md +++ b/rfcs/20200902-pickle-for-keras/README.md @@ -130,6 +130,8 @@ class Model: ``` Note how this exactly mirrors the aforementioned blog post regarding PyTorch ([link](https://matthewrocklin.com/blog/work/2018/07/23/protocols-pickle)). +This would, however, require extensive work to refactor the entire SavedModel ecosystem to allow the user to specify +a file-like object instead of only a folder name. By implementing this in all of Keras' base classes, things will automatically work with custom metrics and subclassed models. @@ -182,18 +184,18 @@ end to end implementations and tests for all of this. * Does the design conform to the backwards & forwards compatibility [requirements](https://www.tensorflow.org/programmers_guide/version_compat)? **YES** * How will this proposal interact with other parts of the TensorFlow Ecosystem? - - How will it work with TFLite? *N/A* - - How will it work with distribution strategies? *N/A* - - How will it interact with tf.function? 
*N/A* - - Will this work on GPU/TPU? *N/A* - - How will it serialize to a SavedModel? *Circular question...* + * How will it work with TFLite? *N/A* + * How will it work with distribution strategies? *N/A* + * How will it interact with tf.function? *N/A* + * Will this work on GPU/TPU? *N/A* + * How will it serialize to a SavedModel? *Circular question...* ### User Impact * What are the user-facing changes? How will this feature be rolled out? -We implement it and are done. I do not think there is any need to edit the docs to advertise this feature. We still want -users to use `Model.save` when they are trying to save their model. +We just have to implement this. I do not think there is any need to edit the docs to advertise this feature. +We still want users to use `Model.save` when they are trying to save their model. ## Questions and Discussion Topics From fc3df6a216099680714b2624dd9fca6431ecbc4d Mon Sep 17 00:00:00 2001 From: Scott Date: Thu, 3 Sep 2020 12:05:27 -0500 Subject: [PATCH 04/26] Edit motivation --- rfcs/20200902-pickle-for-keras/README.md | 194 +++++++++++++++-------- 1 file changed, 129 insertions(+), 65 deletions(-) diff --git a/rfcs/20200902-pickle-for-keras/README.md b/rfcs/20200902-pickle-for-keras/README.md index 3bb376276..6bd13396f 100644 --- a/rfcs/20200902-pickle-for-keras/README.md +++ b/rfcs/20200902-pickle-for-keras/README.md @@ -16,69 +16,97 @@ Implement support for Python's Pickle protocol within Keras. > *Why this is a valuable problem to solve? What background information is needed to show how this design addresses the problem?* -The specific motivation -for this RFC comes from supporting Keras models in Dask-ML's and Ray's hyperparameter -optimization. More generaly, support for serialization with the Pickle protocol will enable: - -* Using Keras with distributed systems (e.g, Python's `multiprocessing`, Dask, Ray or IPython parallel). 
+The specific motivation for this RFC comes from supporting Keras models in +Dask-ML's and Ray's hyperparameter optimization. More generally, support for +serialization with the Pickle protocol will enable: + +* Using Keras with distributed systems and custom parallelization + strategies/libraries (e.g., Python's `multiprocessing`, Dask, Ray or IPython + parallel). +* Saving Keras models to disk with custom serialization libraries (Joblib, + Dill, etc.), which is most common when using a Keras model as part of a + Scikit-Learn pipeline or with their hyperparameter searches. +* Copying Keras models with Python's built-in `copy.deepcopy`. + +Supporting Pickle will enable wider usage in the Python ecosystem. Ecosystems +of libraries depend strongly on the presence of protocols, and a strong +consensus around implementing them consistently and efficiently. See "[Pickle +isn't slow, it's a protocol]" for more detail. Notably, this post focuses on +having an efficient Pickle implementation for PyTorch. + +This request is *not* advocating for use of Pickle while saving or sharing +Keras models. We believe the efficient, secure and stable methods in TF should +be used for that. We are proposing to add a Pickle implementation that uses the +same efficient method. + +[Pickle isn't slow, it's a protocol]:https://matthewrocklin.com/blog/work/2018/07/23/protocols-pickle > *Which users are affected by the problem? Why is it a problem? What data supports this? What related work exists?* Users trying to use distributed systems (e.g., Ray or Dask) with Keras are affected. In our experience, this is common in hyperparameter optimization. 
In +general, having Pickle support means a better experience, especially when using +Keras with other libraries. Briefly, this occurs when using a Keras model in +a Scikit-Learn pipeline or when using a custom parallelization library like Joblib or Dask. +Detail is given in "User Benefit." + +[dask-ml#534]:https://github.com/dask/dask-ml/issues/534 +[SO#51110834]:https://stackoverflow.com/questions/51110834/cannot-pickle-dill-a-keras-object +[SO#54070845]:https://stackoverflow.com/questions/54070845/how-to-pickle-keras-custom-layer +[SO#59872509]:https://stackoverflow.com/questions/59872509/how-to-export-a-model-created-from-kerasclassifier-and-gridsearchcv-using-joblib +[SO#37984304]:https://stackoverflow.com/questions/37984304/how-to-save-a-scikit-learn-pipline-with-keras-regressor-inside-to-disk +[SO#48295661]:https://stackoverflow.com/questions/48295661/how-to-pickle-keras-model +[skper]:https://scikit-learn.org/stable/modules/model_persistence.html#persistence-example +[TF#33204]:https://github.com/tensorflow/tensorflow/issues/33204 +[TF#34697]:https://github.com/tensorflow/tensorflow/issues/34697 + Related work is in [SciKeras], which brings a Scikit-Learn API -to Keras. Scikit-Learn estimators must be able to be pickled ([source][skp]). As such, -SciKeras has an implementation of `__reduce_ex__`, which is also in +to Keras. Scikit-Learn estimators must be able to be pickled ([source][skp]). +As such, SciKeras has an implementation of `__reduce_ex__`, which is also in [tensorflow#39609]. [tensorflow#39609]:https://github.com/tensorflow/tensorflow/pull/39609 [SciKeras]:https://github.com/adriangb/scikeras [skp]:https://github.com/scikit-learn/scikit-learn/blob/0fb307bf39bbdacd6ed713c00724f8f871d60370/sklearn/utils/estimator_checks.py#L1523-L1524 Here are several examples of Keras users running into issues because pickle is not supported. GH issues where `Model.save` would not work: * [TF#34697](https://github.com/tensorflow/tensorflow/issues/34697) * [TF#33204](https://github.com/tensorflow/tensorflow/issues/33204) * TODO: issues from Dask + Here is a post describing how PyTorch ran into similar issues and how they resolved them: [link](https://matthewrocklin.com/blog/work/2018/07/23/protocols-pickle) ## User Benefit > How will users (or other contributors) benefit from this work? What would be the headline in the release notes or blog post? One blog post headline: "Keras models can be used in advanced hyperparameter +optimization with Ray Tune or Dask-ML." Users will also benefit from easier usage; they won't run into any of these +errors: * Scikit-Learn recommends using Joblib for model persistence ([source][skper]). + That means people try to save Scikit-Learn estimators to disk with Joblib. + This fails because Keras models cannot be serialized without a custom + method. Examples of failures include [SO#59872509], [SO#37984304] and + [SO#48295661], and [SO#51110834] uses a different serialization library (dill). +* Using custom parallelization strategies requires serialization support through + Pickle; however, many parallelization libraries like Joblib and Dask don't + special-case Keras models. Relevant errors are most common in hyperparameter + optimization with Scikit-Learn's parallelization through Joblib + ([TF#33204] and [TF#34697]) or parallelization through Dask ([dask-ml#534]). +* Lack of Pickle support can complicate saving training history like in + (the poorly asked) [SO#54070845]. 
+ + +This RFC would resolve these issues. ## Design Proposal @@ -87,19 +115,22 @@ The pickle protocol supports two distinct functions: 1. In-memory copying of live objects: via Python's `copy` module. This falls back to (2) below. 2. Serialization to arbitrary IO (memory or disk): via Python's `pickle` module. -This proposal seeks to take the conservative approach at least initially and only -implement (2) above since (1) can always fall back to (2) and using only (2) alleviates -any concerns around references to freed memory in the C++ portions of TF and other such bugs. +This proposal seeks to take the conservative approach at least initially and +only implement (2) above since (1) can always fall back to (2) and using only +(2) alleviates any concerns around references to freed memory in the C++ +portions of TF and other such bugs. This said, for situations where the user is making an in-memory copy of an object and it might even be okay to keep around references to non-Python objects, a separate approach that optimizes -(1) would be warranted. This RFC does not seek to address this problem. Hence this RFC is generally -not concerned with: +(1) would be warranted. This RFC does not seek to address this problem. Hence +this RFC is generally not concerned with: -* Issues arising from C++ references. These cannot be kept around when serializing to a binary file stream. +* Issues arising from C++ references. These cannot be kept around when + serializing to a binary file stream. * Performance of the serialization/deserialization. -The general proposal is to implement the pickle protocol using existing Keras saving functionality +The general proposal is to implement the pickle protocol using existing Keras +saving functionality as a backend. 
For example, adding pickle/copy support to Metrics is as simple as: ```python3 @@ -115,8 +146,13 @@ the pickle protocol can be found [here](https://docs.python.org/3/library/pickle For more complex objects (namely `tf.keras.Model`) we can either: -1. Implement a similar approach, but we would need to save the weights and config separately. See [this notebook](https://colab.research.google.com/drive/14ECRN8ZQDa1McKri2dctlV_CaPkE574I?authuser=1#scrollTo=qlXDfJObNXVf) for an example. -2. Use `Model.save` as the backend. This would require implementing support for serializing to memory in `Model.save`, but is overall a better solution (since `Model.save` is the official way to save models in `tf.keras`) +1. Implement a similar approach, but we would need to save the weights and + config separately. See [this + notebook](https://colab.research.google.com/drive/14ECRN8ZQDa1McKri2dctlV_CaPkE574I?authuser=1#scrollTo=qlXDfJObNXVf) + for an example. +2. Use `Model.save` as the backend. This would require implementing support for + serializing to memory in `Model.save`, but is overall a better solution + (since `Model.save` is the official way to save models in `tf.keras`) Solution (2) would look something like this (assuming `Model.save` worked with `io.BytesIO()`): @@ -129,9 +165,11 @@ class Model: return load_model, (b.get_value()),) # where load_model is tf.keras.models.load_model ``` -Note how this exactly mirrors the aforementioned blog post regarding PyTorch ([link](https://matthewrocklin.com/blog/work/2018/07/23/protocols-pickle)). -This would however require extensive work to refactor the entire SaveModel ecosystem to allow the user to specify -a file-like object instead of only a folder name. +Note how this exactly mirrors the aforementioned blog post regarding PyTorch +([link](https://matthewrocklin.com/blog/work/2018/07/23/protocols-pickle)). 
This would, however, require extensive work to refactor the entire SavedModel +ecosystem to allow the user to specify a file-like object instead of only a +folder name. By implementing this in all of Keras' base classes, things will automatically work with custom metrics and subclassed models. ### Alternatives Considered The only real alternative is to: -1. Ask all libraries that currently use pickle to make a special case for each Keras object and figure out how each Keras object prefers to be serialized (see use of `serialize` vs `Model.save` above). +1. Ask all libraries that currently use pickle to make a special case for each + Keras object and figure out how each Keras object prefers to be serialized + (see use of `serialize` vs `Model.save` above). 2. Ask all users to learn the above as well. -3. Override `__reduce_ex__` to give a user friendly warning instead of failing cryptically. +3. Override `__reduce_ex__` to give a user-friendly warning instead of failing + cryptically. ### Performance Implications -* The performance should be the same as the underlying backend that is already implemented in TF. -* For cases where the user was going to pickle anyway, this will be faster because it uses TF's methods instead of letting Python deal with it naively. -* Tests will consist of running `new_model = pickle.loads(pickle.loads(model))` and then doing checks on `new_model`. +* The performance should be the same as the underlying backend that is already + implemented in TF. +* For cases where the user was going to pickle anyway, this will be faster + because it uses TF's methods instead of letting Python deal with it naively. +* Tests will consist of running `new_model = pickle.loads(pickle.dumps(model))` + and then doing checks on `new_model`. ### Dependencies * Dependencies: does this proposal add any new dependencies to TensorFlow? 
**NO** +* Dependent projects: are there other areas of TensorFlow or things that use + TensorFlow (TFX/pipelines, TensorBoard, etc.) that this affects? **This + should not affect anything** ### Engineering Impact -* Do you expect changes to binary size / startup time / build time / test times? **NO** +* Do you expect changes to binary size / startup time / build time / test + times? **NO** -* Who will maintain this code? Is this code in its own buildable unit? Can this code be tested in its own? Is visibility suitably restricted to only a small API surface for others to use? +* Who will maintain this code? Is this code in its own buildable unit? Can this + code be tested on its own? Is visibility suitably restricted to only a small + API surface for others to use? -This code depends on existing Keras/TF methods. As long as those are maintained and don't break, this code will not break. The new API surface area is very small. +This code depends on existing Keras/TF methods. As long as those are maintained +and don't break, this code will not break. The new API surface area is very +small. ### Platforms and Environments -* Platforms: does this work on all platforms supported by TensorFlow? If not, why is that ok? Will it work on embedded/mobile? Does it impact automatic code generation or mobile stripping tooling? Will it work with transformation tools? -* Execution environments (Cloud services, accelerator hardware): what impact do you expect and how will you confirm? +* Platforms: does this work on all platforms supported by TensorFlow? If not, + why is that ok? Will it work on embedded/mobile? Does it impact automatic + code generation or mobile stripping tooling? Will it work with transformation + tools? +* Execution environments (Cloud services, accelerator hardware): what impact do + you expect and how will you confirm? -This will work on anything that is running Python >= 2.7 (as far as I can tell, the pickle protocol has not changed since then). +This will work on anything that is running Python >= 2.7 (as far as I can tell, +the pickle protocol has not changed since then). ### Best Practices -* Does this proposal change best practices for some aspect of using/developing TensorFlow? How will these changes be communicated/enforced? **NO** +* Does this proposal change best practices for some aspect of using/developing + TensorFlow? How will these changes be communicated/enforced? **NO** ### Tutorials and Examples end to end implementations and tests for all of this. ### Compatibility -* Does the design conform to the backwards & forwards compatibility [requirements](https://www.tensorflow.org/programmers_guide/version_compat)? **YES** +* Does the design conform to the backwards & forwards compatibility + [requirements](https://www.tensorflow.org/programmers_guide/version_compat)? + **YES** * How will this proposal interact with other parts of the TensorFlow Ecosystem? - * How will it work with TFLite? *N/A* - * How will it work with distribution strategies? *N/A* - * How will it interact with tf.function? *N/A* - * Will this work on GPU/TPU? *N/A* - * How will it serialize to a SavedModel? *Circular question...* + * How will it work with TFLite? *N/A* + * How will it work with distribution strategies? + *N/A, unless the model is still being trained. Then, this enables custom + serialization methods, which might enable support for other distribution strategies.* + * How will it interact with tf.function? *N/A* + * Will this work on GPU/TPU? *N/A* + * How will it serialize to a SavedModel? *Not applicable, and almost a circular question.* ### User Impact * What are the user-facing changes? How will this feature be rolled out? -We just have to implement this. I do not think there is any need to edit the docs to advertise this feature. -We still want users to use `Model.save` when they are trying to save their model. +There are no user-facing changes: this is a backend change to private methods. + +Rolling out only involves testing. It will not require any documentation +changes to advertise this feature: `Model.save` should still be used for users +simply trying to save their model to disk. ## Questions and Discussion Topics From d3b32055a5036bce2453f0c790a3dfd5c9a5f0f0 Mon Sep 17 00:00:00 2001 From: Scott Date: Thu, 3 Sep 2020 15:47:51 -0500 Subject: [PATCH 05/26] Technical edits --- rfcs/20200902-pickle-for-keras/README.md | 177 +++++++++++++---------- 1 file changed, 103 insertions(+), 74 deletions(-) diff --git a/rfcs/20200902-pickle-for-keras/README.md b/rfcs/20200902-pickle-for-keras/README.md index 6bd13396f..18a404e90 100644 --- a/rfcs/20200902-pickle-for-keras/README.md +++ b/rfcs/20200902-pickle-for-keras/README.md @@ -31,8 +31,13 @@ serialization with the Pickle protocol will enable: Supporting Pickle will enable wider usage in the Python ecosystem. Ecosystems of libraries depend strongly on the presence of protocols, and a strong consensus around implementing them consistently and efficiently. See "[Pickle -isn't slow, it's a protocol]" for more detail. Notably, this post focuses on -having an efficent Pickle implementation for PyTorch. +isn't slow, it's a protocol]" for more detail (notably, this post focuses on +having an efficient Pickle implementation for PyTorch). 
Without these protocols, +it's necessary for each library to implement a custom serialization method +(e.g., Dask Distributed has a custom serialization method at +[distributed/protocol/keras.py]). + +[distributed/protocol/keras.py]:https://github.com/dask/distributed/blob/73fa9bd1bd7dcb4ceed72cdbdc6dd4b92f887521/distributed/protocol/keras.py This request is *not* advocating for use of Pickle while saving or sharing Keras models. We believe the efficient, secure and stable methods in TF should @@ -110,80 +115,99 @@ This RFC would resolve these issues. ## Design Proposal -The pickle protocol supports two distinct functions: +The general proposal is to implement the pickle protocol using existing Keras +saving functionality as a backend. For example, adding pickle to TF Metrics +is as simple as the following: -1. In-memory copying of live objects: via Python's `copy` module. This falls back to (2) below. -2. Serialization to arbitrary IO (memory or disk): via Python's `pickle` module. +``` python +from tf.keras.metrics import Metric, serialize, deserialize -This proposal seeks to take the conservative approach at least initially and -only implement (2) above since (1) can always fall back to (2) and using only -(2) alleviates any concerns around references to freed memory in the C++ -portions of TF and other such bugs. - -This said, for situations where the user is making an in-memory copy of an object and it might -even be okay to keep around references to non-Python objects, a separate approach that optimizes -(1) would be warranted. This RFC does not seek to address this problem. Hence -this RFC is generally not concerned with: +class NewMetric(Metric): + def __reduce__(self): + return deserialize, (serialize(self),) +``` -* Issues arising from C++ references. These cannot be kept around when - serializing to a binary file stream. -* Performance of the serialization/deserialization. +This implementation adds support for the Pickle protocol, which supports serialization +to arbitrary IO, either memory or disk. The `__reduce__` special method can return +the serialized form that would have been written to disk and the function that loads it back into memory ([docs][reduce_docs]). +Now, a pickle round trip works with `NewMetric`: + +[reduce_docs]:https://docs.python.org/3/library/pickle.html#object.__reduce__ +``` python +import pickle +m1 = NewMetric() # TODO: is this correct? + +m2 = pickle.loads(pickle.dumps(m1)) +assert m1 == m2 # TODO: or some other check +``` -For more complex objects (namely `tf.keras.Model`) we can either: +For more complex objects like `tf.keras.Model`, there are two options with `__reduce__`: -1. Implement a similar approach, but we would need to save the weights and - config separately. See [this - notebook](https://colab.research.google.com/drive/14ECRN8ZQDa1McKri2dctlV_CaPkE574I?authuser=1#scrollTo=qlXDfJObNXVf) - for an example. -2. Use `Model.save` as the backend. This would require implementing support for - serializing to memory in `Model.save`, but is overall a better solution - (since `Model.save` is the official way to save models in `tf.keras`) +1. Implement a similar approach that saves the weights and + config separately ([notebook example]). +2. Use `Model.save` as the backend. + +Option (1) is similar to Dask Distributed's custom serialization of Keras models in [distributed/protocol/keras.py]. Option (2) would require implementing support for +serializing to memory in `Model.save`. + +Option (2) is preferred since `Model.save` is the recommended method for model persistence. +It would look something like the following, if `Model.save` worked +with `io.BytesIO()`: + +[notebook example]:https://colab.research.google.com/drive/14ECRN8ZQDa1McKri2dctlV_CaPkE574I?authuser=1#scrollTo=qlXDfJObNXVf -Solution (2) would look something like this (assuming `Model.save` worked with `io.BytesIO()`): -```python3 +``` python +# tensorflow/python/.../model.py TODO make sure correct +import io +from tf.keras.models import load_model + class Model: + ... - def __reduce_ex__(self, protocol): - b = io.BytesIO() # python built in library - self.save(b) # i.e. tf.keras.Model.save - return load_model, (b.get_value()),) # where load_model is tf.keras.models.load_model + def __reduce__(self): + b = io.BytesIO() + self.save(b) + return (load_model, (b.getvalue(), )) ``` -Note how this exactly mirrors the aforementioned blog post regarding PyTorch -([link](https://matthewrocklin.com/blog/work/2018/07/23/protocols-pickle)). +This almost exactly mirrors the PyTorch implementation of Pickle support in [pytorch#9184] +as mentioned in "[Pickle isn't slow, it's a protocol]." This would, however, require extensive work to refactor the entire SavedModel ecosystem to allow the user to specify a file-like object instead of only a folder name. + +[pytorch#9184]:https://github.com/pytorch/pytorch/pull/9184 ### Alternatives Considered -The only real alternative is to: +Of course, one method is to ask users to monkey-patch Keras models themselves. +This would hold for libraries too. Clearly, this is unreasonable. 
Regardless, +some libraries like Dask Distributed have already implemented custom serialization +protocols ([distributed/protocol/keras.py]). + +#### Other pickle implementations -1. Ask all libraries that currently use pickle to make a special case for each - Keras object and figure out how each Keras object prefers to be serialized - (see use of `serialize` vs `Model.save` above). -2. Ask all users to learn the above as well. -3. Override `__reduce_ex__` to give a user friendly warning instead of failing - cryptically. +The Pickle protocol supports two features: + +1. In-memory copying of live objects: via Python's `copy` module. This falls back to (2) below. +2. Serialization to arbitrary IO (memory or disk): via Python's `pickle` module. + +This proposal seeks to take the conservative approach at least initially and +only implement (2) above since (1) can always fall back to (2) and using only +(2) alleviates any concerns around references to freed memory in the C++ +portions of TF and other such bugs. + +This said, for situations where the user is making an in-memory copy of an object and it might +even be okay to keep around references to non-Python objects, a separate approach that optimizes +(1) would be warranted. This RFC does not seek to address this problem. Hence +this RFC is generally not concerned with: + +* Issues arising from C++ references. These cannot be kept around when + serializing to a binary file stream. +* Performance of the serialization/deserialization. ### Performance Implications @@ -196,22 +220,23 @@ The only real alternative is to: ### Dependencies -* Dependencies: does this proposal add any new dependencies to TensorFlow? **NO** +* Dependencies: does this proposal add any new dependencies to TensorFlow? + * No * Dependent projects: are there other areas of TensorFlow or things that use - TensorFlow (TFX/pipelines, TensorBoard, etc.) that this affects? 
**This - should not affect anything** + TensorFlow (TFX/pipelines, TensorBoard, etc.) that this affects? + * This should not affect those libraries. It will affect libraries + further downstream like Dask-ML and Ray Tune. ### Engineering Impact * Do you expect changes to binary size / startup time / build time / test - times? **NO** + times? + * No * Who will maintain this code? Is this code in its own buildable unit? Can this code be tested in its own? Is visibility suitably restricted to only a small API surface for others to use? - -This code depends on existing Keras/TF methods. As long as those are maintained -and don't break, this code will not break. The new API surface area is very -small. + * This code depends on existing Keras/TF methods. This code will not break + presuming they are maintained (the new API surface area is very small). ### Platforms and Environments @@ -219,16 +244,16 @@ small. why is that ok? Will it work on embedded/mobile? Does it impact automatic code generation or mobile stripping tooling? Will it work with transformation tools? + * Yes, as long as a Python backend is available. * Execution environments (Cloud services, accelerator hardware): what impact do you expect and how will you confirm? - -This will work on anything that is running Python >= 2.7 (as far as I can tell, -the pickle protocol has not changed since then). + * We don't see any impact. ### Best Practices * Does this proposal change best practices for some aspect of using/developing - TensorFlow? How will these changes be communicated/enforced? **NO** + TensorFlow? How will these changes be communicated/enforced? + * No ### Tutorials and Examples @@ -240,16 +265,20 @@ end to end implementations and tests for all of this. * Does the design conform to the backwards & forwards compatibility [requirements](https://www.tensorflow.org/programmers_guide/version_compat)? - **YES** - -* How will this proposal interact with other parts of the TensorFlow Ecosystem? 
-  * How will it work with TFLite? *N/A*
-  * How will it work with distribution strategies?
-    *N/A, unlesss the model is still being trained. Then, this enables custom
-    serialization methods, which might enable support for other distribution strategies.*
-  * How will it interact with tf.function? *N/A*
-  * Will this work on GPU/TPU? *N/A*
-  * How will it serialize to a SavedModel? *Not applicable, and almost a circular question.*
+  * Yes
+
+> *How will this proposal interact with other parts of the TensorFlow Ecosystem?*
+
+* How will it work with TFLite?
+  * N/A
+* How will it work with distribution strategies?
+  * This enables use of other serialization libraries, which might enable support for other distribution strategies.
+* How will it interact with tf.function?
+  * N/A
+* Will this work on GPU/TPU?
+  * N/A
+* How will it serialize to a SavedModel?
+  * Not applicable, and almost a circular question.
 
 ### User Impact
 
From a882e8cc5ffa7c821d61081d47e7a4aaa03f466b Mon Sep 17 00:00:00 2001
From: Scott
Date: Thu, 3 Sep 2020 23:59:18 -0500
Subject: [PATCH 06/26] Edits

---
 rfcs/20200902-pickle-for-keras/README.md | 61 ++++++++++++------------
 1 file changed, 31 insertions(+), 30 deletions(-)

diff --git a/rfcs/20200902-pickle-for-keras/README.md b/rfcs/20200902-pickle-for-keras/README.md
index 18a404e90..291f9db4d 100644
--- a/rfcs/20200902-pickle-for-keras/README.md
+++ b/rfcs/20200902-pickle-for-keras/README.md
@@ -16,15 +16,14 @@ Implement support for Python's Pickle protocol within Keras.
 
 > *Why this is a valuable problem to solve? What background information is needed
 to show how this design addresses the problem?*
 
-The specific motivation for this RFC comes from supporting Keras models in
+The specific motivation for this RFC: we want to use Keras models in
 Dask-ML's and Ray's hyperparameter optimization. More generaly, support for
 serialization with the Pickle protocol will enable:
 
-* Using Keras with distributed systems and custom parallelization
-  strategies/libraries (e.g, Python's `multiprocessing`, Dask, Ray or IPython
-  parallel).
-* Saving Keras models to disk with custom serialization libraries (Joblib,
-  Dill, etc), which is most common when using a Keras model as part of a
+* Using Keras with other parallelization libraries like Python's
+  `multiprocessing`, Dask, Ray or IPython parallel.
+* Saving Keras models to disk with custom serialization libraries like Joblib or
+  Dill. This is common when using a Keras model as part of a
   Scikit-Learn pipeline or with their hyperparameter searches.
 * Copying Keras models with Python's built-in `copy.deepcopy`.
 
 Supporting Pickle will enable wider usage in the Python ecosystem. Ecosystems
 of libraries depend strongly on the presence of protocols, and a strong
 consensus around implementing them consistently and efficiently. See "[Pickle
 isn't slow, it's a protocol]" for more detail (notably, this post focuses on
 having an efficent Pickle implementation for PyTorch). Without these protocols,
 it's necessary for each library to implement a custom serialization method
-(e.g, Dask Distributed has a custom serialization method at
+(e.g, Dask Distributed has a custom serialization method for Keras at
 [distributed/protocol/keras.py])
 
 [distributed/protocol/keras.py]:https://github.com/dask/distributed/blob/73fa9bd1bd7dcb4ceed72cdbdc6dd4b92f887521/distributed/protocol/keras.py
 
 This request is *not* advocating for use of Pickle while saving or sharing
 Keras models. We believe the efficient, secure and stable methods in TF should
-be used for that. We are proposing to add a Pickle implementation that uses the
-same efficient method.
+be used for that. Instead, we are proposing to add a Pickle implementation to
+support wider usage in the Python ecosystem.
 
-[Pickle isn't slow, it's a protocol]:https://matthewrocklin.com/blog/work/2018/07/23/protocols-pickle
+[Pickle isn't slow, it's a
+protocol]:https://matthewrocklin.com/blog/work/2018/07/23/protocols-pickle
 
-> *Which users are affected by the problem? Why is it a problem? What data supports
-this? What related work exists?*
+> *Which users are affected by the problem? Why is it a problem? What data
+> supports this? What related work exists?*
 
@@ -52,9 +51,17 @@ this? What related work exists?*
 
 Users trying to use distributed systems (e.g, Ray or Dask) with Keras are
 affected. In our experience, this is common in hyperparameter optimization. In
 general, having Pickle support means a better experience, especially when using
-Keras with other libraries. Briefly, this when using a Keras model in
-a Scikit-Learn pipeline or when using a custom parallelization like Joblib or Dask.
-Detail is give in "User Benefit."
+Keras with other libraries. Briefly, implementation of this RFC will make the following possible:
+
+* Saving a Scikit-Learn pipeline to disk if it includes a Keras model
+* Using custom parallelization like Joblib or Dask.
+
+More use cases and examples are given in "User Benefit."
+
+Related work is in [SciKeras], which brings a Scikit-Learn API
+to Keras. Pickle is relevant because Scikit-Learn requires that estimators be able to be pickled ([source][skp]).
+As such, SciKeras has an implementation of `__reduce_ex__`, which is also in
+[tensorflow#39609].
 
 [dask-ml#534]:https://github.com/dask/dask-ml/issues/534
 [SO#51110834]:https://stackoverflow.com/questions/51110834/cannot-pickle-dill-a-keras-object
@@ -65,12 +72,6 @@ Detail is give in "User Benefit."
 [skper]:https://scikit-learn.org/stable/modules/model_persistence.html#persistence-example
 [TF#33204]:https://github.com/tensorflow/tensorflow/issues/33204
 [TF#34697]:https://github.com/tensorflow/tensorflow/issues/34697
-
-
-Related work is in [SciKeras], which brings a Scikit-Learn API
-to Keras. Scikit-Learn estimators must be able to be pickled ([source][skp]).
-As such, SciKeras has an implementation of `__reduce_ex__`, which is also in
-[tensorflow#39609].
 
 [tensorflow#39609]:https://github.com/tensorflow/tensorflow/pull/39609
 [SciKeras]:https://github.com/adriangb/scikeras
@@ -91,20 +92,20 @@ Examples that could be resolved using `Model.save` (but the user tried pickle fi
 
 > How will users (or other contributors) benefit from this work? What would
 be the headline in the release notes or blog post?
 
-One blog post headline: "Keras models can be used in advanced hyperparameter
-optimization with Ray Tune or Dask-ML."
+One blog post headline: "Keras models can be used with the advanced
+hyperparameter optimization techniques found in Dask-ML and Ray Tune."
 
 Users will also benefit with easier usage; they won't run into any of these
 errors:
 
-* Scikit-Learn recommends using Joblib for model persistence ([source][skper]).
-  That means people try to save Scikit-Learn estimators to disk with Joblib.
+* People try to save Scikit-Learn meta-estimators with Keras components using
+  the serialization libraries Joblib or Dill.
   This fails because Keras models can not be serialized without a custom
-  method. Examples of failures include [SO#59872509], [SO#37984304] and
-  [SO#48295661], and [SO#51110834] uses a different serialization library (dill).
+  method. Examples include [SO#59872509], [SO#37984304],
+  [SO#48295661], and [SO#51110834].
 * Using custom parallelization strategies requires serialization support through
-  Pickle; however, many parallelization libraries like Joblib and Dask don't
-  special case Keras models. Relevant errors are most common in hyperparameter
+  Pickle; however, many parallelization libraries don't
+  special case Keras models (e.g, Joblib). Relevant errors are most common in hyperparameter
   optimization with Scikit-Learn's parallelization through Joblib ([TF#33204]
   and [TF#34697]) or parallelization through Dask ([dask-ml#534]).
 * Lack of Pickle support can complicate saving training history like in
@@ -115,8 +116,8 @@ This RFC would resolve these issues.
 
 ## Design Proposal
 
-The general proposal is to implement the pickle protocol using existing Keras
-saving functionality as a backend. For example, adding pickle to TF Metrics
+We propose implementing the Pickle protocol using the existing Keras
+saving functionality as a backend. For example, adding pickle support to TF Metrics
 is as simple as the following:
 
 ``` python
@@ -170,7 +171,7 @@ class Model:
     def __reduce__(self):
         b = io.BytesIO()
         self.save(b)
-        return (load_model, (b.getvalue(), ))
+        return load_model, (b.getvalue(), )
 ```
 
 This almost exactly mirrors the PyTorch implementation of Pickle support in [pytorch#9184]
@@ -215,7 +216,7 @@ this RFC is generally not concerned with:
   implemented in TF.
 * For cases where the user was going to pickle anyway, this will be faster
   because it uses TF's methods instead of letting Python deal with it naively.
-* Tests will consist of running `new_model = pickle.loads(pickle.loads(model))`
+* Tests will consist of running `new_model = pickle.loads(pickle.dumps(model))`
   and then doing checks on `new_model`.
 
 ### Dependencies
 
From 610d13f499474f321f42e96957dc8c1b7987322d Mon Sep 17 00:00:00 2001
From: Scott
Date: Fri, 4 Sep 2020 00:02:14 -0500
Subject: [PATCH 07/26] Single file

---
 .../README.md => 20200902-pickle-for-keras.md} | 0
 1 file changed, 0 insertions(+), 0 deletions(-)
 rename rfcs/{20200902-pickle-for-keras/README.md => 20200902-pickle-for-keras.md} (100%)

diff --git a/rfcs/20200902-pickle-for-keras/README.md b/rfcs/20200902-pickle-for-keras.md
similarity index 100%
rename from rfcs/20200902-pickle-for-keras/README.md
rename to rfcs/20200902-pickle-for-keras.md

From a06e2ac8ee000e0af742740733e040422cf90b5d Mon Sep 17 00:00:00 2001
From: Scott
Date: Fri, 4 Sep 2020 00:12:18 -0500
Subject: [PATCH 08/26] better wording

---
 rfcs/20200902-pickle-for-keras.md | 43 ++++++++++++++++---------------
 1 file changed, 22 insertions(+), 21 deletions(-)

diff --git a/rfcs/20200902-pickle-for-keras.md b/rfcs/20200902-pickle-for-keras.md
index 291f9db4d..bc3408b15 100644
--- a/rfcs/20200902-pickle-for-keras.md
+++ b/rfcs/20200902-pickle-for-keras.md
@@ -13,28 +13,27 @@ Implement support for Python's Pickle protocol within Keras.
 
 ## Motivation
 
-> *Why this is a valuable problem to solve? What background information is needed
-to show how this design addresses the problem?*
+> *Why this is a valuable problem to solve? What background information is
+> needed to show how this design addresses the problem?*
 
-The specific motivation for this RFC: we want to use Keras models in
-Dask-ML's and Ray's hyperparameter optimization. More generaly, support for
-serialization with the Pickle protocol will enable:
+The specific motivation for this RFC: we want to use Keras models in Dask-ML's
+and Ray's hyperparameter optimization. More generaly, support for serialization
+with the Pickle protocol will enable:
 
 * Using Keras with other parallelization libraries like Python's
   `multiprocessing`, Dask, Ray or IPython parallel.
-* Saving Keras models to disk with custom serialization libraries like Joblib or
-  Dill. This is common when using a Keras model as part of a
-  Scikit-Learn pipeline or with their hyperparameter searches.
+* Saving Keras models to disk with custom serialization libraries like Joblib
+  or Dill. This is common when using a Keras model as part of a Scikit-Learn
+  pipeline or with their hyperparameter searches.
 * Copying Keras models with Python's built-in `copy.deepcopy`.
 
-Supporting Pickle will enable wider usage in the Python ecosystem. Ecosystems
-of libraries depend strongly on the presence of protocols, and a strong
-consensus around implementing them consistently and efficiently. See "[Pickle
-isn't slow, it's a protocol]" for more detail (notably, this post focuses on
-having an efficent Pickle implementation for PyTorch). Without these protocols,
-it's necessary for each library to implement a custom serialization method
-(e.g, Dask Distributed has a custom serialization method for Keras at
-[distributed/protocol/keras.py])
+Supporting Pickle will enable wider usage in the Python ecosystem because
+ecosystems of libraries depend strongly on the presence of protocols. See
+"[Pickle isn't slow, it's a protocol]" for more detail (notably, this post
+focuses on having an efficent Pickle implementation for PyTorch). Without these
+protocols, it's necessary for each library to implement a custom serialization
+method. For example, Dask Distributed has a custom serialization method for
+Keras at [distributed/protocol/keras.py].
 
 [distributed/protocol/keras.py]:https://github.com/dask/distributed/blob/73fa9bd1bd7dcb4ceed72cdbdc6dd4b92f887521/distributed/protocol/keras.py
 
 This request is *not* advocating for use of Pickle while saving or sharing
 Keras models. We believe the efficient, secure and stable methods in TF should
-be used for that. We are proposing to add a Pickle implementation that uses the
-same efficient method.
+be used for that. Instead, we are proposing to add a Pickle implementation to
+support wider usage in the Python ecosystem.
 
-[Pickle isn't slow, it's a protocol]:https://matthewrocklin.com/blog/work/2018/07/23/protocols-pickle
+[Pickle isn't slow, it's a
+protocol]:https://matthewrocklin.com/blog/work/2018/07/23/protocols-pickle
 
-> *Which users are affected by the problem? Why is it a problem? What data supports
-this? What related work exists?*
+> *Which users are affected by the problem? Why is it a problem? What data
+> supports this? What related work exists?*
 
-Users trying to use distributed systems (e.g, Ray or Dask) with Keras are 
+Users trying to use distributed systems (e.g, Ray or Dask) with Keras are
 affected. In our experience, this is common in hyperparameter optimization. In
 general, having Pickle support means a better experience, especially when using
-Keras with other libraries. Briefly, implementation of this RFC will make the following possible:
+Keras with other libraries. Briefly, implementation of this RFC will make the
+following possible:
 
 * Saving a Scikit-Learn pipeline to disk if it includes a Keras model
 * Using custom parallelization like Joblib or Dask.
 
From b98f2be0c8be991d2b5b8245d0f072625a20bcc6 Mon Sep 17 00:00:00 2001
From: Scott
Date: Fri, 4 Sep 2020 00:18:45 -0500
Subject: [PATCH 09/26] spell check

---
 rfcs/20200902-pickle-for-keras.md | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/rfcs/20200902-pickle-for-keras.md b/rfcs/20200902-pickle-for-keras.md
index bc3408b15..a0de8eb1e 100644
--- a/rfcs/20200902-pickle-for-keras.md
+++ b/rfcs/20200902-pickle-for-keras.md
@@ -17,7 +17,7 @@ Implement support for Python's Pickle protocol within Keras.
 > needed to show how this design addresses the problem?*
 
 The specific motivation for this RFC: we want to use Keras models in Dask-ML's
-and Ray's hyperparameter optimization. More generaly, support for serialization
+and Ray's hyperparameter optimization. More generally, support for serialization
 with the Pickle protocol will enable:
 
 * Using Keras with other parallelization libraries like Python's
@@ -30,7 +30,7 @@ with the Pickle protocol will enable:
 Supporting Pickle will enable wider usage in the Python ecosystem because
 ecosystems of libraries depend strongly on the presence of protocols. See
 "[Pickle isn't slow, it's a protocol]" for more detail (notably, this post
-focuses on having an efficent Pickle implementation for PyTorch). Without these
+focuses on having an efficient Pickle implementation for PyTorch). Without these
 protocols, it's necessary for each library to implement a custom serialization
 method. For example, Dask Distributed has a custom serialization method for
 Keras at [distributed/protocol/keras.py].
@@ -177,7 +177,7 @@ class Model:
 
 This almost exactly mirrors the PyTorch implementation of Pickle support in [pytorch#9184]
 as mentioned in "[Pickle isn't slow, it's a protocol]."
 
-This would however require extensive work to refactor the entire SavedModel
+This would however require extensive work to refactor the entire `SavedModel`
From 8ed3397ffe0d024b6a6d49d81da00498312ec29c Mon Sep 17 00:00:00 2001 From: Scott Date: Fri, 4 Sep 2020 00:39:22 -0500 Subject: [PATCH 10/26] Link to blog.dask --- rfcs/20200902-pickle-for-keras.md | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-) diff --git a/rfcs/20200902-pickle-for-keras.md b/rfcs/20200902-pickle-for-keras.md index a0de8eb1e..bd1dac4db 100644 --- a/rfcs/20200902-pickle-for-keras.md +++ b/rfcs/20200902-pickle-for-keras.md @@ -42,8 +42,7 @@ Keras models. We believe the efficient, secure and stable methods in TF should be used for that. Instead, we are proposing to add a Pickle implementation to support wider usage in the Python ecosystem. -[Pickle isn't slow, it's a -protocol]:https://matthewrocklin.com/blog/work/2018/07/23/protocols-pickle +[Pickle isn't slow, it's a protocol]:https://blog.dask.org/2018/07/23/protocols-pickle > *Which users are affected by the problem? Why is it a problem? What data > supports this? What related work exists?* From cd6278ad980f767a621fe454be50896b1c1ff994 Mon Sep 17 00:00:00 2001 From: Scott Date: Fri, 4 Sep 2020 00:43:07 -0500 Subject: [PATCH 11/26] Reorganizing --- rfcs/20200902-pickle-for-keras.md | 12 ++++++------ 1 file changed, 6 insertions(+), 6 deletions(-) diff --git a/rfcs/20200902-pickle-for-keras.md b/rfcs/20200902-pickle-for-keras.md index bd1dac4db..5544db32d 100644 --- a/rfcs/20200902-pickle-for-keras.md +++ b/rfcs/20200902-pickle-for-keras.md @@ -28,12 +28,12 @@ with the Pickle protocol will enable: * Copying Keras models with Python's built-in `copy.deepcopy`. Supporting Pickle will enable wider usage in the Python ecosystem because -ecosystems of libraries depend strongly on the presence of protocols. See -"[Pickle isn't slow, it's a protocol]" for more detail (notably, this post -focuses on having an efficient Pickle implementation for PyTorch). Without these -protocols, it's necessary for each library to implement a custom serialization -method. 
For example, Dask Distributed has a custom serialization method for -Keras at [distributed/protocol/keras.py]. +Python's ecosystems of libraries depend strongly on the presence of protocols. +Without these protocols, it's necessary for each library to implement a custom +serialization method. For example, Dask Distributed has a custom serialization +method for Keras at [distributed/protocol/keras.py]. See "[Pickle isn't slow, +it's a protocol]" for more detail (notably, this post focuses on having an +efficient Pickle implementation for PyTorch). [distributed/protocol/keras.py]:https://github.com/dask/distributed/blob/73fa9bd1bd7dcb4ceed72cdbdc6dd4b92f887521/distributed/protocol/keras.py From 265aee463485ed600ac6b2490127415851866994 Mon Sep 17 00:00:00 2001 From: Scott Date: Fri, 4 Sep 2020 00:44:52 -0500 Subject: [PATCH 12/26] Reorganizing --- rfcs/20200902-pickle-for-keras.md | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/rfcs/20200902-pickle-for-keras.md b/rfcs/20200902-pickle-for-keras.md index 5544db32d..84a358a19 100644 --- a/rfcs/20200902-pickle-for-keras.md +++ b/rfcs/20200902-pickle-for-keras.md @@ -30,10 +30,10 @@ with the Pickle protocol will enable: Supporting Pickle will enable wider usage in the Python ecosystem because Python's ecosystems of libraries depend strongly on the presence of protocols. Without these protocols, it's necessary for each library to implement a custom -serialization method. For example, Dask Distributed has a custom serialization -method for Keras at [distributed/protocol/keras.py]. See "[Pickle isn't slow, -it's a protocol]" for more detail (notably, this post focuses on having an -efficient Pickle implementation for PyTorch). +serialization method for every other library. For example, Dask Distributed has +a custom serialization method for Keras at [distributed/protocol/keras.py]. 
+See "[Pickle isn't slow, it's a protocol]" for more detail (notably, this post +focuses on having an efficient Pickle implementation for PyTorch). [distributed/protocol/keras.py]:https://github.com/dask/distributed/blob/73fa9bd1bd7dcb4ceed72cdbdc6dd4b92f887521/distributed/protocol/keras.py From 3ccd7c58bb5956bac107e4f6105319e24a915ee3 Mon Sep 17 00:00:00 2001 From: Adrian Garcia Badaracco <1755071+adriangb@users.noreply.github.com> Date: Fri, 4 Sep 2020 10:53:43 -0500 Subject: [PATCH 13/26] Add title and status --- rfcs/20200902-pickle-for-keras.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/rfcs/20200902-pickle-for-keras.md b/rfcs/20200902-pickle-for-keras.md index 84a358a19..c35eb84a6 100644 --- a/rfcs/20200902-pickle-for-keras.md +++ b/rfcs/20200902-pickle-for-keras.md @@ -1,6 +1,6 @@ -# Title of RFC +# Implement Python's Pickle Protocol -| Status | (Proposed / Accepted / Implemented / Obsolete) | +| Status | Proposed | :-------------- |:---------------------------------------------------- | | **RFC #** | [286](https://github.com/tensorflow/community/pull/286) | | **Author(s)** | Adrian Garcia Badaracco ({firstname}@{firstname}gb.com), Scott Sievert (tf-rfc@stsievert.com) | From 42d5a07446cde487528c6194c5dd6b9349f13373 Mon Sep 17 00:00:00 2001 From: Scott Date: Sun, 6 Sep 2020 14:21:11 -0500 Subject: [PATCH 14/26] Edit title --- rfcs/20200902-pickle-for-keras.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/rfcs/20200902-pickle-for-keras.md b/rfcs/20200902-pickle-for-keras.md index c35eb84a6..da2abba5c 100644 --- a/rfcs/20200902-pickle-for-keras.md +++ b/rfcs/20200902-pickle-for-keras.md @@ -1,4 +1,4 @@ -# Implement Python's Pickle Protocol +# Support Pickle, Python's serialization protocol | Status | Proposed | :-------------- |:---------------------------------------------------- | From a396a6e20d483e57e57bc0f021f70958fc07f594 Mon Sep 17 00:00:00 2001 From: Scott Date: Sun, 6 Sep 2020 14:29:19 -0500 Subject: 
[PATCH 15/26] Add link to Dask blog post --- rfcs/20200902-pickle-for-keras.md | 12 ++++++++---- 1 file changed, 8 insertions(+), 4 deletions(-) diff --git a/rfcs/20200902-pickle-for-keras.md b/rfcs/20200902-pickle-for-keras.md index da2abba5c..8e0631ae3 100644 --- a/rfcs/20200902-pickle-for-keras.md +++ b/rfcs/20200902-pickle-for-keras.md @@ -1,4 +1,4 @@ -# Support Pickle, Python's serialization protocol +# Support for Pickle, Python's serialization protocol | Status | Proposed | :-------------- |:---------------------------------------------------- | @@ -9,7 +9,7 @@ ## Objective -Implement support for Python's Pickle protocol within Keras. +Implement support for Pickle, Python's serialization protocol within Keras. ## Motivation @@ -92,8 +92,12 @@ Examples that could be resolved using `Model.save` (but the user tried pickle fi > How will users (or other contributors) benefit from this work? What would be the headline in the release notes or blog post? -One blog post headline: "Keras models can be used with the advanced -hyperparameter optimization techniques found in Dask-ML and Ray Tune." +One blog post headline: "Keras models can be used with the advanced +hyperparameter optimization techniques found in Dask-ML and Ray Tune." This has +already been mentioned in "Framework support" of [a Dask blog post][dbp] +comparing Dask-ML's hyperparameter optimization with Ray's tune-sklearn. 
+ +[dbp]:https://blog.dask.org/2020/08/06/ray-tune#framework-support Users will also benefit with easier usage; they won't run into any of these errors: From a7f963cb417f748d3942c68f8b019ebf5b84f181 Mon Sep 17 00:00:00 2001 From: Adrian Garcia Badaracco <1755071+adriangb@users.noreply.github.com> Date: Fri, 11 Sep 2020 15:17:37 -0400 Subject: [PATCH 16/26] Update 20200902-pickle-for-keras.md --- rfcs/20200902-pickle-for-keras.md | 45 +++++++++++++------------------ 1 file changed, 19 insertions(+), 26 deletions(-) diff --git a/rfcs/20200902-pickle-for-keras.md b/rfcs/20200902-pickle-for-keras.md index 8e0631ae3..8072ca3c0 100644 --- a/rfcs/20200902-pickle-for-keras.md +++ b/rfcs/20200902-pickle-for-keras.md @@ -147,42 +147,35 @@ m2 = pickle.loads(pickle.dumps(m1)) assert m1 == m2 # TODO: or some other check ``` -For more complex objects like `tf.keras.Model`, there are two options with `__reduce__`: - -1. Implement a similar approach that saves the weights and - config separately ([notebook example]). -2. Use `Model.save` as the backend. - -Option (1) is similar to Dask Distributed's custom serialization of Keras models in [distributed/protocol/keras.py]. Option 2 would require implementing support for -serializing to memory in `Model.save`. - -Option (2) is preferred because it seems to be a better solution -since `Model.save` is the recommended method for model persistence. -It would look something like the following, if `Model.save` worked -with `io.BytesIO()`: - -[notebook example]:https://colab.research.google.com/drive/14ECRN8ZQDa1McKri2dctlV_CaPkE574I?authuser=1#scrollTo=qlXDfJObNXVf - +For `tf.keras.Model`, we can use `SaveModel` as the backend for `__reduce__`: ``` python -# tensorflow/python/.../model.py TODO make sure correct -import io +# tensorflow/python/.../training.py from tf.keras.models import load_model class Model: ... 
- - def __reduce__(self): - b = io.BytesIO() - self.save(b) - return load_model, (b.get_value(), ) + def __reduce__(self, protocol): + save_folder = f"tmp/saving/{id(self)}" + ram_prefix = "ram://" + temp_ram_location = os.path.join(ram_prefix, save_folder) + self.save(temp_ram_location) + b = tf.io.gfile.read_folder(temp_ram_location) + return self._reconstruct_pickle, (np.asarray(memoryview(b)), ) + + @classmethod + def _reconstruct_pickle(cls, obj): + save_folder = f"tmp/saving/{id(obj)}" + ram_prefix = "ram://" + temp_ram_location = os.path.join(ram_prefix, save_folder) + tf.io.gfile.write_folder(temp_ram_location, b) + return load_model(temp_ram_location) ``` This almost exactly mirrors the PyTorch implementation of Pickle support in [pytorch#9184] as mentioned in "[Pickle isn't slow, it's a protocol]." -This would however require extensive work to refactor the entire `SaveModel` -ecosystem to allow the user to specify a file-like object instead of only a -folder name. + +Small augmentations to TensorFlow's `io` module would be required, as discussed in [tensorflow#39609]. [pytorch#9184]:https://github.com/pytorch/pytorch/pull/9184 From 0d4d273f80a19fe65fc675441ab3368538d92f7b Mon Sep 17 00:00:00 2001 From: Adrian Garcia Badaracco <1755071+adriangb@users.noreply.github.com> Date: Fri, 11 Sep 2020 15:26:22 -0400 Subject: [PATCH 17/26] Clarify support for Pickle 5 --- rfcs/20200902-pickle-for-keras.md | 16 +++++++--------- 1 file changed, 7 insertions(+), 9 deletions(-) diff --git a/rfcs/20200902-pickle-for-keras.md b/rfcs/20200902-pickle-for-keras.md index 8072ca3c0..e5f8bca45 100644 --- a/rfcs/20200902-pickle-for-keras.md +++ b/rfcs/20200902-pickle-for-keras.md @@ -156,22 +156,20 @@ from tf.keras.models import load_model class Model: ... 
def __reduce__(self, protocol): - save_folder = f"tmp/saving/{id(self)}" - ram_prefix = "ram://" - temp_ram_location = os.path.join(ram_prefix, save_folder) - self.save(temp_ram_location) - b = tf.io.gfile.read_folder(temp_ram_location) + self.save(f"ram://tmp/saving/{id(self)") + b = tf.io.gfile.read_folder(f"ram://tmp/saving/{id(self)") return self._reconstruct_pickle, (np.asarray(memoryview(b)), ) @classmethod def _reconstruct_pickle(cls, obj): - save_folder = f"tmp/saving/{id(obj)}" - ram_prefix = "ram://" - temp_ram_location = os.path.join(ram_prefix, save_folder) - tf.io.gfile.write_folder(temp_ram_location, b) + tf.io.gfile.write_folder(f"ram://tmp/saving/{id(obj)", b) return load_model(temp_ram_location) ``` +By wrapping the pickled object within a `Numpy` array, pickling will support +pickle protocol 5 for zero-copy pickling. This provides an immediate +performance improvement for many use cases. + This almost exactly mirrors the PyTorch implementation of Pickle support in [pytorch#9184] as mentioned in "[Pickle isn't slow, it's a protocol]." From d0a92958abc62b18b10d352f060f223303085b69 Mon Sep 17 00:00:00 2001 From: Adrian Garcia Badaracco <1755071+adriangb@users.noreply.github.com> Date: Tue, 15 Sep 2020 11:50:38 -0400 Subject: [PATCH 18/26] fix typo --- rfcs/20200902-pickle-for-keras.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/rfcs/20200902-pickle-for-keras.md b/rfcs/20200902-pickle-for-keras.md index e5f8bca45..78fc048f5 100644 --- a/rfcs/20200902-pickle-for-keras.md +++ b/rfcs/20200902-pickle-for-keras.md @@ -157,12 +157,12 @@ class Model: ... 
def __reduce__(self, protocol): self.save(f"ram://tmp/saving/{id(self)") - b = tf.io.gfile.read_folder(f"ram://tmp/saving/{id(self)") + b = tf.io.gfile.read_folder(f"ram://tmp/saving/{id(self)}") return self._reconstruct_pickle, (np.asarray(memoryview(b)), ) @classmethod def _reconstruct_pickle(cls, obj): - tf.io.gfile.write_folder(f"ram://tmp/saving/{id(obj)", b) + tf.io.gfile.write_folder(f"ram://tmp/saving/{id(obj)}", b) return load_model(temp_ram_location) ``` From 81d3b7b68e84cd0ac71b4c3f8036af9516e56876 Mon Sep 17 00:00:00 2001 From: Adrian Garcia Badaracco <1755071+adriangb@users.noreply.github.com> Date: Tue, 15 Sep 2020 11:54:25 -0400 Subject: [PATCH 19/26] __reduce__ -> __reduce_ex__ for PEP 574 support --- rfcs/20200902-pickle-for-keras.md | 10 +++++----- 1 file changed, 5 insertions(+), 5 deletions(-) diff --git a/rfcs/20200902-pickle-for-keras.md b/rfcs/20200902-pickle-for-keras.md index 78fc048f5..fa97bbc30 100644 --- a/rfcs/20200902-pickle-for-keras.md +++ b/rfcs/20200902-pickle-for-keras.md @@ -128,16 +128,16 @@ is as simple as the following: from tf.keras.metrics import Metric, serialize, deserialize class NewMetric(Metric): - def __reduce__(self, protocol): + def __reduce_ex__(self, protocol): return deserialize, (serialize(self),) ``` This implementation adds support for the Pickle protocol, which supports serialization -to arbitrary IO, either memory or disk. The `__reduce__` special method can return +to arbitrary IO, either memory or disk. The `__reduce_ex__` special method can return the string that would have been written to disk and the function to load that string into memory ([docs][reduce_docs]). 
Now, the tests pass with `NewMetric`: -[reduce_docs]:https://docs.python.org/3/library/pickle.html#object.__reduce__ +[reduce_docs]:https://docs.python.org/3/library/pickle.html#object.__reduce_ex__ ``` python import pickle @@ -147,7 +147,7 @@ m2 = pickle.loads(pickle.dumps(m1)) assert m1 == m2 # TODO: or some other check ``` -For `tf.keras.Model`, we can use `SaveModel` as the backend for `__reduce__`: +For `tf.keras.Model`, we can use `SaveModel` as the backend for `__reduce_ex__`: ``` python # tensorflow/python/.../training.py @@ -155,7 +155,7 @@ from tf.keras.models import load_model class Model: ... - def __reduce__(self, protocol): + def __reduce_ex__(self, protocol): self.save(f"ram://tmp/saving/{id(self)") b = tf.io.gfile.read_folder(f"ram://tmp/saving/{id(self)}") return self._reconstruct_pickle, (np.asarray(memoryview(b)), ) From a160d72e6d3075d37bbed0660ccb0658f970c986 Mon Sep 17 00:00:00 2001 From: Scott Sievert Date: Fri, 18 Sep 2020 22:41:43 -0500 Subject: [PATCH 20/26] Update 20200902-pickle-for-keras.md --- rfcs/20200902-pickle-for-keras.md | 13 +++++-------- 1 file changed, 5 insertions(+), 8 deletions(-) diff --git a/rfcs/20200902-pickle-for-keras.md b/rfcs/20200902-pickle-for-keras.md index fa97bbc30..e62261c62 100644 --- a/rfcs/20200902-pickle-for-keras.md +++ b/rfcs/20200902-pickle-for-keras.md @@ -150,11 +150,9 @@ assert m1 == m2 # TODO: or some other check For `tf.keras.Model`, we can use `SaveModel` as the backend for `__reduce_ex__`: ``` python -# tensorflow/python/.../training.py from tf.keras.models import load_model -class Model: - ... +class NewModel(Model): def __reduce_ex__(self, protocol): self.save(f"ram://tmp/saving/{id(self)") b = tf.io.gfile.read_folder(f"ram://tmp/saving/{id(self)}") @@ -166,15 +164,14 @@ class Model: return load_model(temp_ram_location) ``` +Small augmentations to TensorFlow's `io` module would be required, as discussed in [tensorflow#39609]. 
+ By wrapping the pickled object within a `Numpy` array, pickling will support pickle protocol 5 for zero-copy pickling. This provides an immediate -performance improvement for many use cases. - -This almost exactly mirrors the PyTorch implementation of Pickle support in [pytorch#9184] +performance improvement for many use cases. This almost exactly mirrors the PyTorch +implementation of Pickle support in [pytorch#9184] as mentioned in "[Pickle isn't slow, it's a protocol]." -Small augmentations to TensorFlow's `io` module would be required, as discussed in [tensorflow#39609]. - [pytorch#9184]:https://github.com/pytorch/pytorch/pull/9184 ### Alternatives Considered From 957c1ba2bbd65e54c5312f1003223286f3be0821 Mon Sep 17 00:00:00 2001 From: Scott Sievert Date: Fri, 18 Sep 2020 22:45:26 -0500 Subject: [PATCH 21/26] Delete metric tests --- rfcs/20200902-pickle-for-keras.md | 9 --------- 1 file changed, 9 deletions(-) diff --git a/rfcs/20200902-pickle-for-keras.md b/rfcs/20200902-pickle-for-keras.md index e62261c62..9d33bbd0b 100644 --- a/rfcs/20200902-pickle-for-keras.md +++ b/rfcs/20200902-pickle-for-keras.md @@ -135,18 +135,9 @@ class NewMetric(Metric): This implementation adds support for the Pickle protocol, which supports serialization to arbitrary IO, either memory or disk. The `__reduce_ex__` special method can return the string that would have been written to disk and the function to load that string into memory ([docs][reduce_docs]). -Now, the tests pass with `NewMetric`: [reduce_docs]:https://docs.python.org/3/library/pickle.html#object.__reduce_ex__ -``` python -import pickle -m1 = NewMetric() # TODO: is this correct? 
- -m2 = pickle.loads(pickle.dumps(m1)) -assert m1 == m2 # TODO: or some other check -``` - For `tf.keras.Model`, we can use `SaveModel` as the backend for `__reduce_ex__`: ``` python From 0c6ef25421ae08d905acf4aef11f27b82192b63b Mon Sep 17 00:00:00 2001 From: Scott Sievert Date: Sat, 19 Sep 2020 16:21:54 -0500 Subject: [PATCH 22/26] Change metric file --- rfcs/20200902-pickle-for-keras.md | 9 +++++++-- 1 file changed, 7 insertions(+), 2 deletions(-) diff --git a/rfcs/20200902-pickle-for-keras.md b/rfcs/20200902-pickle-for-keras.md index 9d33bbd0b..85e4e7e2c 100644 --- a/rfcs/20200902-pickle-for-keras.md +++ b/rfcs/20200902-pickle-for-keras.md @@ -125,10 +125,15 @@ saving functionality as a backend. For example, adding pickle support to TF Metr is as simple as the following: ``` python -from tf.keras.metrics import Metric, serialize, deserialize +# tensorflow/python/keras/metrics.py + +@keras_export('keras.metrics.Metric') # line 80 +@six.add_metaclass(abc.ABCMeta) +class Metric(base_layer.Layer): + ... -class NewMetric(Metric): def __reduce_ex__(self, protocol): + # the deserialized/serialize functions are defined in this file return deserialize, (serialize(self),) ``` From de2d703109448d02b86b50cbac87559a37dd431e Mon Sep 17 00:00:00 2001 From: Scott Sievert Date: Sat, 19 Sep 2020 16:31:59 -0500 Subject: [PATCH 23/26] Model file --- rfcs/20200902-pickle-for-keras.md | 16 ++++++++++------ 1 file changed, 10 insertions(+), 6 deletions(-) diff --git a/rfcs/20200902-pickle-for-keras.md b/rfcs/20200902-pickle-for-keras.md index 85e4e7e2c..ba74d79de 100644 --- a/rfcs/20200902-pickle-for-keras.md +++ b/rfcs/20200902-pickle-for-keras.md @@ -132,9 +132,9 @@ is as simple as the following: class Metric(base_layer.Layer): ... 
- def __reduce_ex__(self, protocol): - # the deserialized/serialize functions are defined in this file - return deserialize, (serialize(self),) + def __reduce_ex__(self, protocol): + # the deserialized/serialize functions are defined in this file + return deserialize, (serialize(self),) ``` This implementation adds support for the Pickle protocol, which supports serialization @@ -146,9 +146,13 @@ the string that would have been written to disk and the function to load that st For `tf.keras.Model`, we can use `SaveModel` as the backend for `__reduce_ex__`: ``` python -from tf.keras.models import load_model +# tensorflow/python/keras/engine/training.py +... +from tesorflow.python.keras.models import load_model + +class Model(base_layer.Layer, version_utils.ModelVersionSelector): # line 131 + ... -class NewModel(Model): def __reduce_ex__(self, protocol): self.save(f"ram://tmp/saving/{id(self)") b = tf.io.gfile.read_folder(f"ram://tmp/saving/{id(self)}") @@ -160,7 +164,7 @@ class NewModel(Model): return load_model(temp_ram_location) ``` -Small augmentations to TensorFlow's `io` module would be required, as discussed in [tensorflow#39609]. +Small augmentations to TensorFlow's IO module would be required, as discussed in [tensorflow#39609]. By wrapping the pickled object within a `Numpy` array, pickling will support pickle protocol 5 for zero-copy pickling. 
This provides an immediate From b393fe9d939726dabd73ab90f25134cd11c08a4b Mon Sep 17 00:00:00 2001 From: Scott Sievert Date: Sat, 19 Sep 2020 16:35:35 -0500 Subject: [PATCH 24/26] Update 20200902-pickle-for-keras.md --- rfcs/20200902-pickle-for-keras.md | 10 ++++++---- 1 file changed, 6 insertions(+), 4 deletions(-) diff --git a/rfcs/20200902-pickle-for-keras.md b/rfcs/20200902-pickle-for-keras.md index ba74d79de..119c2eea5 100644 --- a/rfcs/20200902-pickle-for-keras.md +++ b/rfcs/20200902-pickle-for-keras.md @@ -164,13 +164,15 @@ class Model(base_layer.Layer, version_utils.ModelVersionSelector): # line 131 return load_model(temp_ram_location) ``` -Small augmentations to TensorFlow's IO module would be required, as discussed in [tensorflow#39609]. -By wrapping the pickled object within a `Numpy` array, pickling will support -pickle protocol 5 for zero-copy pickling. This provides an immediate -performance improvement for many use cases. This almost exactly mirrors the PyTorch +This almost exactly mirrors the PyTorch implementation of Pickle support in [pytorch#9184] as mentioned in "[Pickle isn't slow, it's a protocol]." +In addition, small augmentations to TensorFlow's IO module will be required (as discussed in [tensorflow#39609]). + +By wrapping the pickled object within a Numpy array, pickling will support +pickle protocol 5 for zero-copy pickling. This provides an immediate +performance improvement for many use cases. 
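The zero-copy claim can be illustrated with the standard library alone. This is a generic PEP 574 sketch, not Keras code: a `bytearray` stands in for serialized model weights, and `buffer_callback` lets the payload travel out-of-band so the pickle stream itself stays tiny:

```python
import pickle

# Stand-in for a large serialized-model payload.
weights = bytearray(1 << 20)  # 1 MiB

# Protocols <= 4 copy the payload into the pickle stream itself.
in_band = pickle.dumps(weights, protocol=4)
assert len(in_band) > len(weights)

# Protocol 5 can hand the buffer off out-of-band: the stream holds
# only a reference, and the payload bytes are never copied into it.
buffers = []
out_of_band = pickle.dumps(pickle.PickleBuffer(weights), protocol=5,
                           buffer_callback=buffers.append)
assert len(out_of_band) < 100  # opcodes and metadata only

# Loading needs the same buffers handed back in.
restored = pickle.loads(out_of_band, buffers=buffers)
assert bytes(restored) == bytes(weights)
```

A distributed framework (Dask, Ray) plays the role of `buffer_callback` here, shipping the buffers over its own transport instead of through the pickle stream.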
[pytorch#9184]:https://github.com/pytorch/pytorch/pull/9184 From 349b64931089b2c6ce922ce12b115324625eb963 Mon Sep 17 00:00:00 2001 From: Adrian Garcia Badaracco <1755071+adriangb@users.noreply.github.com> Date: Sat, 19 Sep 2020 16:41:28 -0500 Subject: [PATCH 25/26] Define temp_ram_location --- rfcs/20200902-pickle-for-keras.md | 8 +++++--- 1 file changed, 5 insertions(+), 3 deletions(-) diff --git a/rfcs/20200902-pickle-for-keras.md b/rfcs/20200902-pickle-for-keras.md index 119c2eea5..36366ac9e 100644 --- a/rfcs/20200902-pickle-for-keras.md +++ b/rfcs/20200902-pickle-for-keras.md @@ -154,13 +154,15 @@ class Model(base_layer.Layer, version_utils.ModelVersionSelector): # line 131 ... def __reduce_ex__(self, protocol): - self.save(f"ram://tmp/saving/{id(self)") - b = tf.io.gfile.read_folder(f"ram://tmp/saving/{id(self)}") + temp_ram_location = f"ram://tmp/saving/{id(self)}" + self.save(temp_ram_location) + b = tf.io.gfile.read_folder(temp_ram_location) return self._reconstruct_pickle, (np.asarray(memoryview(b)), ) @classmethod def _reconstruct_pickle(cls, obj): - tf.io.gfile.write_folder(f"ram://tmp/saving/{id(obj)}", b) + temp_ram_location = f"ram://tmp/saving/{id(obj)}" + tf.io.gfile.write_folder(temp_ram_location, b) return load_model(temp_ram_location) ``` From 42e4a631c44e5c717eb6d922fef8bdc34152f014 Mon Sep 17 00:00:00 2001 From: Adrian Garcia Badaracco <1755071+adriangb@users.noreply.github.com> Date: Mon, 21 Sep 2020 13:54:04 -0500 Subject: [PATCH 26/26] __reduce_ex__ -> __reduce__ --- rfcs/20200902-pickle-for-keras.md | 12 ++++++------ 1 file changed, 6 insertions(+), 6 deletions(-) diff --git a/rfcs/20200902-pickle-for-keras.md b/rfcs/20200902-pickle-for-keras.md index 36366ac9e..e2afa4db1 100644 --- a/rfcs/20200902-pickle-for-keras.md +++ b/rfcs/20200902-pickle-for-keras.md @@ -60,7 +60,7 @@ More use cases and examples are give in "User Benefit." Related work is in [SciKeras], which brings a Scikit-Learn API to Keras. 
Pickle is relevant because Scikit-Learn requires that estimators be picklable ([source][skp]).
-As such, SciKeras has an implementation of `__reduce_ex__`, which is also in
+As such, SciKeras has an implementation of `__reduce__`, which is also in
 [tensorflow#39609].
 
 [dask-ml#534]:https://github.com/dask/dask-ml/issues/534
@@ -132,18 +132,18 @@ is as simple as the following:
 class Metric(base_layer.Layer):
     ...
 
-    def __reduce_ex__(self, protocol):
+    def __reduce__(self):
         # the deserialized/serialize functions are defined in this file
         return deserialize, (serialize(self),)
 ```
 
 This implementation adds support for the Pickle protocol, which supports serialization
-to arbitrary IO, either memory or disk. The `__reduce_ex__` special method can return
+to arbitrary IO, either memory or disk. The `__reduce__` special method can return
 the string that would have been written to disk and the function to load that string into memory ([docs][reduce_docs]).
 
-[reduce_docs]:https://docs.python.org/3/library/pickle.html#object.__reduce_ex__
+[reduce_docs]:https://docs.python.org/3/library/pickle.html#object.__reduce__
 
-For `tf.keras.Model`, we can use `SaveModel` as the backend for `__reduce_ex__`:
+For `tf.keras.Model`, we can use `SavedModel` as the backend for `__reduce__`:
 
 ``` python
 # tensorflow/python/keras/engine/training.py
@@ -153,7 +153,7 @@ from tensorflow.python.keras.models import load_model
 class Model(base_layer.Layer, version_utils.ModelVersionSelector): # line 131
     ...
 
-    def __reduce_ex__(self, protocol):
+    def __reduce__(self):
         temp_ram_location = f"ram://tmp/saving/{id(self)}"
         self.save(temp_ram_location)
         b = tf.io.gfile.read_folder(temp_ram_location)
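For illustration, the end state of the series — a no-argument `__reduce__` returning a classmethod reconstructor plus a byte payload — can be exercised with a stand-in class. Everything here is a toy: a JSON blob substitutes for the bytes that `Model.save` would write to the `ram://` filesystem, and `_reconstruct_pickle` mirrors the proposed classmethod only in shape:

```python
import json
import pickle


class ToyModel:
    def __init__(self, weights=None):
        self.weights = list(weights or [])

    def __reduce__(self):
        # Stand-in for: self.save(ram_path); read the SavedModel
        # folder back as a single byte payload.
        blob = json.dumps({"weights": self.weights}).encode()
        return self._reconstruct_pickle, (blob,)

    @classmethod
    def _reconstruct_pickle(cls, blob):
        # Stand-in for: write the payload back to ram://, then
        # load_model(...) from that location.
        return cls(**json.loads(blob.decode()))


m1 = ToyModel(weights=[1.5, -2.0])
m2 = pickle.loads(pickle.dumps(m1))
assert m2.weights == [1.5, -2.0]
```

Bound classmethods pickle by reference to their class, so the reconstructor needs no module-level wrapper. Because `copy.deepcopy` falls back on the same reduce machinery when no `__deepcopy__` is defined, this pattern also makes `copy.deepcopy(model)` work for free.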