Please refer to the Jupyter notebook file for details.
The official TensorFlow guide says that a tf.data.Dataset can be fed directly into model.fit in Keras since TF 1.9. However, as of TF 1.10 (Sept 2018) this simply does not work, as you can see from the following images.
I spent three days figuring out what the problem was, and it turns out that IT IS JUST NOT WORKING in TF 1.10. It does work with tf-nightly (TF 1.12-dev at the time of writing), as discussed here and here.
For those who may run into the same problems I did, I made this tutorial code to show how you can incorporate tf.data into your Keras model.
**[IMPORTANT]** If you have a TensorFlow version below 1.11 installed, you should first install tf-nightly (or tf-nightly-gpu) via `pip install tf-nightly` (or `pip install tf-nightly-gpu`).
- This code consists of four parts, as follows:
  1. Load the MNIST dataset and build a simple 2-layer MLP model (784-40-40-10) using Keras.
  2. Assumption: your data is small enough to fit in memory. Solution: build a tf.data.Dataset from the in-memory arrays and train the model with it (see the first sketch after this list).
  3. Assumption: your data is too large to fit in memory. Solution: write a TFRecord file first, then build a tf.data.Dataset from it (see the second sketch after this list).
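Here is a minimal sketch of parts 1 and 2, assuming tf.keras and the MNIST data shipped with keras.datasets; the layer sizes follow the 784-40-40-10 MLP described above, while the optimizer, batch size, and epoch count are illustrative choices:

```python
import tensorflow as tf

# Load MNIST and flatten the 28x28 images into 784-dim float vectors.
(x_train, y_train), _ = tf.keras.datasets.mnist.load_data()
x_train = x_train.reshape(-1, 784).astype('float32') / 255.0
y_train = y_train.astype('int32')

# Simple MLP with two hidden layers (784-40-40-10).
model = tf.keras.models.Sequential([
    tf.keras.layers.Dense(40, activation='relu', input_shape=(784,)),
    tf.keras.layers.Dense(40, activation='relu'),
    tf.keras.layers.Dense(10, activation='softmax'),
])
model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])

# Small-data case: build a tf.data.Dataset directly from in-memory arrays.
dataset = (tf.data.Dataset.from_tensor_slices((x_train, y_train))
           .shuffle(buffer_size=10000)
           .batch(32)
           .repeat())

# With tf-nightly (>= 1.11), the Dataset can be passed straight to fit();
# steps_per_epoch is required because the dataset repeats indefinitely.
model.fit(dataset, epochs=5, steps_per_epoch=len(x_train) // 32)
```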
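For the large-data case (part 3), here is a hedged sketch of writing a TFRecord file and streaming it back as a tf.data.Dataset. It reuses x_train and y_train from the sketch above; the file name train.tfrecord and the feature schema are illustrative assumptions:

```python
import tensorflow as tf

# Write each (image, label) pair as a serialized tf.train.Example record.
# 'train.tfrecord' is an illustrative file name.
with tf.python_io.TFRecordWriter('train.tfrecord') as writer:
    for image, label in zip(x_train, y_train):
        example = tf.train.Example(features=tf.train.Features(feature={
            'image': tf.train.Feature(
                float_list=tf.train.FloatList(value=image)),
            'label': tf.train.Feature(
                int64_list=tf.train.Int64List(value=[label])),
        }))
        writer.write(example.SerializeToString())

# Parse one serialized Example back into (image, label) tensors.
def parse_fn(serialized):
    features = tf.parse_single_example(serialized, {
        'image': tf.FixedLenFeature([784], tf.float32),
        'label': tf.FixedLenFeature([], tf.int64),
    })
    return features['image'], features['label']

# Stream records from disk instead of holding everything in memory.
dataset = (tf.data.TFRecordDataset('train.tfrecord')
           .map(parse_fn)
           .shuffle(10000)
           .batch(32)
           .repeat())
```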
- Estimator is TensorFlow's standard model abstraction for real-world products, ready to deploy on Google Cloud.
This code first shows how tf.data can be naturally incorporated into a pre-made Estimator provided by the TensorFlow APIs, as sketched below.
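A hedged sketch of feeding tf.data into a pre-made estimator, here tf.estimator.DNNClassifier; the feature key 'x', the hidden-unit sizes, and the step count are illustrative, and x_train/y_train come from the in-memory sketch above:

```python
import tensorflow as tf

# The input_fn returns a Dataset of (features dict, labels);
# the Estimator consumes it directly.
def train_input_fn():
    dataset = tf.data.Dataset.from_tensor_slices(({'x': x_train}, y_train))
    return dataset.shuffle(10000).batch(32).repeat()

feature_columns = [tf.feature_column.numeric_column('x', shape=[784])]

estimator = tf.estimator.DNNClassifier(
    feature_columns=feature_columns,
    hidden_units=[40, 40],
    n_classes=10)

estimator.train(input_fn=train_input_fn, steps=2000)
```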
You may want to build a custom model using Keras and train it on your large-scale data. One of the easiest ways to do this is to (i) write a TFRecord file, (ii) make a tf.data.Dataset from it, (iii) build a model using Keras, (iv) convert the model to an Estimator using model_to_estimator, and (v) train it as in the previous example (see the sketch below).
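A sketch of steps (iii)-(v), assuming the compiled Keras model, parse_fn, and train.tfrecord from the earlier sketches; note that model_to_estimator keys the input features by the Keras input layer's name, so the input_fn reuses that key via model.input_names:

```python
import tensorflow as tf

# Convert the compiled Keras model into an Estimator.
estimator = tf.keras.estimator.model_to_estimator(keras_model=model)

# The converted estimator expects features keyed by the Keras input name.
input_name = model.input_names[0]

def tfrecord_input_fn():
    dataset = (tf.data.TFRecordDataset('train.tfrecord')
               .map(parse_fn)
               .map(lambda image, label: ({input_name: image}, label))
               .shuffle(10000)
               .batch(32)
               .repeat())
    return dataset

estimator.train(input_fn=tfrecord_input_fn, steps=2000)
```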
This code also explains how to upload your data from your local machine to a Google Cloud Storage bucket, get the list of files on the bucket, and train the model using the data on the bucket (see the sketch below).
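A hedged sketch of the bucket workflow using TensorFlow's own file API; the bucket name gs://my-bucket is a placeholder, parse_fn and train.tfrecord come from the TFRecord sketch above, and this assumes your environment is already authenticated for Google Cloud (e.g. via gcloud auth):

```python
import tensorflow as tf

BUCKET = 'gs://my-bucket'  # placeholder bucket name

# Upload the local TFRecord file to the bucket.
tf.gfile.Copy('train.tfrecord', BUCKET + '/train.tfrecord', overwrite=True)

# Get the list of matching files on the bucket.
files = tf.gfile.Glob(BUCKET + '/*.tfrecord')

# tf.data reads gs:// paths natively, so training works unchanged.
dataset = (tf.data.TFRecordDataset(files)
           .map(parse_fn)
           .shuffle(10000)
           .batch(32)
           .repeat())
```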
I hope this tutorial saves you some time. Feel free to use or change this code for your own purposes. If you are interested in deep learning research, you can follow me on Twitter for interesting discussions.