While it is possible to feed your data to a model with a custom generator, the tf.data API offers features that can be both more convenient and more efficient, such as `.cache()` and `.prefetch()` (demonstrated in `CacheAndPrefetch.ipynb`). This repository implements four different approaches for loading a dataset:
- a) Using `.from_tensor_slices()`
- b) Using `.from_generator()`
- c) Using `.flow_from_directory()`
- d) Using TFRecords (`.tfrecords` files)
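The four approaches can be sketched as follows. This is a minimal illustration, not the repository's actual benchmark code: the array shapes, the `samples.tfrecord` filename, and the `data/train` directory are all hypothetical, and the `.flow_from_directory()` variant is left commented out because it needs an image folder on disk.

```python
import numpy as np
import tensorflow as tf

# Toy in-memory "segmentation" data (shapes are arbitrary for illustration).
images = np.random.rand(8, 32, 32, 3).astype("float32")
masks = np.random.randint(0, 2, size=(8, 32, 32, 1)).astype("float32")

# a) from_tensor_slices: wrap arrays that already fit in memory.
ds_slices = tf.data.Dataset.from_tensor_slices((images, masks))

# b) from_generator: yield samples one at a time from a Python generator.
def sample_generator():
    for img, mask in zip(images, masks):
        yield img, mask

ds_gen = tf.data.Dataset.from_generator(
    sample_generator,
    output_signature=(
        tf.TensorSpec(shape=(32, 32, 3), dtype=tf.float32),
        tf.TensorSpec(shape=(32, 32, 1), dtype=tf.float32),
    ),
)

# c) flow_from_directory: read images organised in subfolders on disk
# (requires a real directory, so it is only sketched here):
# gen = tf.keras.preprocessing.image.ImageDataGenerator(rescale=1.0 / 255)
# flow = gen.flow_from_directory("data/train", target_size=(32, 32))

# d) TFRecords: serialise samples to a file, then parse them back lazily.
def to_example(img, mask):
    return tf.train.Example(features=tf.train.Features(feature={
        "img": tf.train.Feature(bytes_list=tf.train.BytesList(value=[img.tobytes()])),
        "mask": tf.train.Feature(bytes_list=tf.train.BytesList(value=[mask.tobytes()])),
    }))

with tf.io.TFRecordWriter("samples.tfrecord") as writer:
    for img, mask in zip(images, masks):
        writer.write(to_example(img, mask).SerializeToString())

feature_spec = {
    "img": tf.io.FixedLenFeature([], tf.string),
    "mask": tf.io.FixedLenFeature([], tf.string),
}

def parse(record):
    parsed = tf.io.parse_single_example(record, feature_spec)
    img = tf.reshape(tf.io.decode_raw(parsed["img"], tf.float32), (32, 32, 3))
    mask = tf.reshape(tf.io.decode_raw(parsed["mask"], tf.float32), (32, 32, 1))
    return img, mask

ds_records = tf.data.TFRecordDataset("samples.tfrecord").map(parse)
```

All four produce a `tf.data.Dataset` of `(image, mask)` pairs, so the rest of the input pipeline (batching, caching, prefetching) is identical regardless of which loading approach is used.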
These approaches are benchmarked on a segmentation task in `LoadingBenchmark.ipynb`.
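The `.cache()` and `.prefetch()` strategies mentioned above can be sketched as below. This is a generic example, not the notebook's code; `expensive_preprocess` is a hypothetical stand-in for a costly per-sample transformation.

```python
import tensorflow as tf

ds = tf.data.Dataset.range(100)

def expensive_preprocess(x):
    # Stand-in for a costly transformation (decoding, augmentation, ...).
    return tf.cast(x, tf.float32) / 100.0

pipeline = (
    ds.map(expensive_preprocess, num_parallel_calls=tf.data.AUTOTUNE)
      .cache()                      # reuse preprocessed samples after the first epoch
      .batch(16)
      .prefetch(tf.data.AUTOTUNE)   # overlap data preparation with training
)
```

`.cache()` avoids repeating the `map` work on every epoch, while `.prefetch()` lets the pipeline prepare the next batch while the current one is being consumed by the model.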
To execute the benchmarking on your machine, you will need to:
- Create a virtual environment from `requirements.txt`. This is easily done by running `pip install -r requirements.txt` in the terminal.
- Install the packages required for GPU computing. Note that the code has only been tested with TensorFlow 2.4.0, which works with CUDA 11.0 and cuDNN 8.0. For other combinations, check the link.
- Start a W&B container to log memory usage during training. A quickstart tutorial can be found here.
Please write to me if you have any questions about the code ([email protected]).