September 22, 2021
tf.data.Dataset generators with parallelization: the easy way
Or how to use the new tf.data.Dataset objects as generators to train a machine learning model with TensorFlow, with parallelized processing.

The tf.data API is now the gold standard for building efficient data pipelines for machine learning applications with TensorFlow. Applications that require tremendous quantities of training samples, such as deep learning, often have training data that simply cannot fit in memory. In such cases, generators are a natural solution.
This post tries to answer the following question: how can one use the new tf.data.Dataset objects as generators for the training of a machine learning model, with parallelized processing?
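To make the question concrete, here is a minimal sketch of one common pattern, not necessarily the approach taken in the full post: a cheap generator yields only sample indices, and the expensive per-sample work is moved into a parallel map. The names load_sample, tf_load_sample, index_generator and NUM_SAMPLES are hypothetical placeholders, and the synthetic arrays stand in for real loading code.

```python
import numpy as np
import tensorflow as tf

NUM_SAMPLES = 8  # hypothetical dataset size, for illustration only


def load_sample(index):
    """Hypothetical stand-in for an expensive load (disk read, decoding, ...)."""
    index = int(index)
    features = np.full((4,), index, dtype=np.float32)
    label = np.int32(index % 2)
    return features, label


def tf_load_sample(index):
    # tf.numpy_function lets the Python loader run inside Dataset.map.
    features, label = tf.numpy_function(
        load_sample, [index], [tf.float32, tf.int32]
    )
    # Static shapes are lost through numpy_function, so restore them.
    features.set_shape((4,))
    label.set_shape(())
    return features, label


def index_generator():
    # Cheap generator: it yields sample indices only, never the data itself.
    for i in range(NUM_SAMPLES):
        yield i


dataset = (
    tf.data.Dataset.from_generator(
        index_generator,
        output_signature=tf.TensorSpec(shape=(), dtype=tf.int64),
    )
    # The per-sample work is spread over several parallel calls.
    .map(tf_load_sample, num_parallel_calls=tf.data.AUTOTUNE)
    .batch(4)
    .prefetch(tf.data.AUTOTUNE)
)

batches = list(dataset.as_numpy_iterator())
```

A dataset built this way can be passed directly to model.fit. Because the wrapped loader is still ordinary Python code, the actual speed-up depends on how much of its work releases the Python global interpreter lock (file I/O and most NumPy operations typically do).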
Find out in our latest Scortex Tech Blog post! Now on Medium:

tf.data.Dataset generators with parallelization: the easy way
Sep 21, 2021