guillemboada/TensorFlowLoadingBenchmark

Benchmark of dataset loading approaches in TensorFlow

While it is possible to feed data into model fitting with a custom generator, the tf.data API offers alternatives that may be more convenient and efficient, making use of strategies such as .cache() and .prefetch() (demonstrated in CacheAndPrefetch.ipynb). This repository implements four dataset loading approaches:

  • a) Using .from_tensor_slices()
  • b) Using .from_generator()
  • c) Using .flow_from_directory()
  • d) Using TFRecords (via tf.data.TFRecordDataset)
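The first two approaches can be sketched on tiny synthetic data. This is a minimal illustration, not the repository's exact code; the array shapes are made up to stand in for image/mask pairs of a segmentation task.

```python
import numpy as np
import tensorflow as tf

# Tiny synthetic "segmentation" data: 4 images with matching binary masks.
images = np.random.rand(4, 8, 8, 3).astype("float32")
masks = np.random.randint(0, 2, size=(4, 8, 8, 1)).astype("float32")

# a) .from_tensor_slices(): data already held in memory as arrays
ds_slices = tf.data.Dataset.from_tensor_slices((images, masks))

# b) .from_generator(): data produced lazily by a Python generator
def pair_generator():
    for img, msk in zip(images, masks):
        yield img, msk

ds_generator = tf.data.Dataset.from_generator(
    pair_generator,
    output_signature=(
        tf.TensorSpec(shape=(8, 8, 3), dtype=tf.float32),
        tf.TensorSpec(shape=(8, 8, 1), dtype=tf.float32),
    ),
)

# Both datasets yield (image, mask) pairs of the same shapes.
for img, msk in ds_slices.take(1):
    print(img.shape, msk.shape)  # → (8, 8, 3) (8, 8, 1)
```

Approach c), .flow_from_directory(), belongs to the Keras ImageDataGenerator API and reads images directly from a directory tree instead.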

These are benchmarked on a segmentation task in LoadingBenchmark.ipynb.
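Of the four approaches, the TFRecord-based one involves the most machinery, since the pairs must be serialized to disk and parsed back. The following is a minimal sketch under assumed shapes and a hypothetical file name (pairs.tfrecords), with .cache() and .prefetch() applied at the end as in the strategies mentioned above.

```python
import numpy as np
import tensorflow as tf

images = np.random.rand(2, 8, 8, 3).astype("float32")
masks = np.random.randint(0, 2, size=(2, 8, 8, 1)).astype("float32")

def serialize_pair(image, mask):
    # Pack the raw bytes of one (image, mask) pair into a tf.train.Example.
    feature = {
        "image": tf.train.Feature(bytes_list=tf.train.BytesList(value=[image.tobytes()])),
        "mask": tf.train.Feature(bytes_list=tf.train.BytesList(value=[mask.tobytes()])),
    }
    return tf.train.Example(features=tf.train.Features(feature=feature)).SerializeToString()

path = "pairs.tfrecords"  # hypothetical file name
with tf.io.TFRecordWriter(path) as writer:
    for img, msk in zip(images, masks):
        writer.write(serialize_pair(img, msk))

def parse_pair(record):
    # Inverse of serialize_pair: recover fixed-shape float32 tensors.
    spec = {
        "image": tf.io.FixedLenFeature([], tf.string),
        "mask": tf.io.FixedLenFeature([], tf.string),
    }
    parsed = tf.io.parse_single_example(record, spec)
    image = tf.reshape(tf.io.decode_raw(parsed["image"], tf.float32), (8, 8, 3))
    mask = tf.reshape(tf.io.decode_raw(parsed["mask"], tf.float32), (8, 8, 1))
    return image, mask

ds = (
    tf.data.TFRecordDataset(path)
    .map(parse_pair)
    .cache()
    .batch(2)
    .prefetch(tf.data.AUTOTUNE)
)
```

The write/parse round trip is the price of this approach; in exchange, TFRecord files stream efficiently from disk without keeping the dataset in memory.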

[Figure: training times measured for each loading approach]

Prerequisites

To run the benchmark on your machine, you will need to:

  • Create a virtual environment from requirements.txt. This can be done by executing pip install -r requirements.txt in the terminal.
  • Install the packages required for GPU computing. Note that the code has only been tested with TensorFlow 2.4.0, which works with CUDA 11.0 and cuDNN 8.0. For other combinations, check TensorFlow's tested build configurations.
  • Start a W&B (Weights & Biases) container to log memory usage during training. A quickstart tutorial can be found here.
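After installing, the GPU setup can be sanity-checked from Python before launching the notebooks. This is a generic check, not part of the repository; the versions in the comments are those listed in the prerequisites.

```python
import tensorflow as tf

# Quick sanity check of the installation.
print("TensorFlow:", tf.__version__)                      # prerequisites assume 2.4.0
print("Built with CUDA:", tf.test.is_built_with_cuda())   # needs CUDA 11.0 / cuDNN 8.0
print("Visible GPUs:", tf.config.list_physical_devices("GPU"))
```

An empty GPU list means TensorFlow will silently fall back to the CPU, which would distort the benchmark timings.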

Please write to me if you have any questions about the code ([email protected]).
