Plugin API for generator support with tensorflow-io #151
Comments
The idea comes from an issue I faced before. At one point I was doing research on defending against IDN homograph attacks, and we needed an image of each Unicode character for analysis, around 1 million pictures in total. We ended up running many processes on quite a few machines to generate the pictures concurrently, which was tedious. We could probably have used TensorFlow to generate the images and then trained on them directly in TensorFlow.
I really like the idea.
Note that #246 could be related.
It is possible to use a generator to output tf.data in TensorFlow through tf.data.Dataset.from_generator. The implementation follows a similar path to tf.py_func, so it has some limitations.
A plugin-like API may not be difficult to implement in order to output tf.data from within C/C++. Essentially, we could define a C API for each plugin to expose. Each plugin would be built as a dynamic shared library (.so or .dll) and loaded dynamically, so that the C API can be called to generate the data. In the tensorflow-io kernel, we would then output the generated data as tf.data (to be used by tf.keras, etc.).
Note that we only define the C API that a plugin exposes; there is no restriction on the implementation. A plugin could be implemented in C++ internally and still expose the C API.
Note: This could be part of the GSoC (TensorFlow) project:
https://docs.google.com/document/d/1zT57PFMGZ04A4CvHxAKVpMTgXjsO92_oKeSKwZMc0Gs/edit