Python – way to stack two tensorflow datasets

numpypythonpython-3.xtensorflowtensorflow-datasets

I want to stack two datasets objects in Tensorflow (rbind function in R). I have created one dataset A from tfRecord files and one dataset B from numpy arrays. Both have same variables. Do you know if there is a way to stack these two datasets to create a bigger one ? Or to create an iterrator that will randomly read data from this two sources ?

Thanks

Best Answer

The tf.data.Dataset.concatenate() method is the closest analog of tf.stack() when working with datasets. If you have two datasets with the same structure (i.e. same types for each component, but possibly different shapes):

dataset_1 = tf.data.Dataset.range(10, 20)
dataset_2 = tf.data.Dataset.range(60, 70)

then you can concatenate them as follows:

combined_dataset = dataset_1.concatenate(dataset_2)