Hi,
I don’t have deep knowledge of TensorFlow and read about a utility called ‘TFRecord’. Is it the counterpart to ‘DataLoader’ in PyTorch?
Best Regards
No, TFRecord is a different thing from DataLoader. tf.data is the counterpart to DataLoader. Both of them can read data in different formats (NumPy arrays, text, paths to images).
TFRecord is much more like a database that you create before training and read from during training. The main advantage is that you read a few large files instead of many small ones, which should be faster. TFRecord is also a special record format natively supported by TensorFlow.
In PyTorch you can use any database you like for reading the data. It is up to you which one you choose.
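To make the "few large files instead of many small ones" idea concrete, here is a minimal sketch in plain Python. It is not the real TFRecord format (which also stores checksums); it only assumes a simple length-prefixed layout, and the names `write_records` and `read_records` are made up for illustration.

```python
import io
import struct


def write_records(stream, records):
    """Write each bytes record as an 8-byte little-endian length, then the payload."""
    for payload in records:
        stream.write(struct.pack("<Q", len(payload)))
        stream.write(payload)


def read_records(stream):
    """Yield the payloads back in the order they were written."""
    while True:
        header = stream.read(8)
        if not header:
            return  # clean end of the stream
        if len(header) < 8:
            raise ValueError("truncated length header")
        (length,) = struct.unpack("<Q", header)
        payload = stream.read(length)
        if len(payload) < length:
            raise ValueError("truncated record")
        yield payload


# An in-memory buffer stands in for one large file on disk (assumption for the demo).
buffer = io.BytesIO()
samples = [b"sample-0", b"", b"a longer sample \x00 with binary data"]
write_records(buffer, samples)

buffer.seek(0)
restored = list(read_records(buffer))
```

All samples end up in one sequential stream, so reading them back is a single pass over one file rather than one open/read/close per sample.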
So in the context of PyTorch, the term ‘database’ means an instance of the ‘torch.utils.data.Dataset’ class, doesn’t it?
No, by database I mean an SQL or LMDB database. Read here: What's the best way to load large data?
So ‘database’ here is an external, general term.
torch.utils.data.Dataset defines how we want to parse and transform the data (e.g. read from LMDB and apply data augmentation).
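As a rough sketch of that split, the class below reads samples from a database and applies a transform in `__getitem__`. It uses SQLite from the standard library instead of LMDB only to keep the example self-contained; the table layout, the class name `SQLiteDataset`, and the `scale` transform are all hypothetical.

```python
import json
import sqlite3

try:
    from torch.utils.data import Dataset  # real base class when PyTorch is installed
except ImportError:
    Dataset = object  # fallback so the sketch still runs without PyTorch


class SQLiteDataset(Dataset):
    """Map-style dataset: the row with id i in the `samples` table is item i."""

    def __init__(self, connection, transform=None):
        self.connection = connection
        self.transform = transform

    def __len__(self):
        (count,) = self.connection.execute("SELECT COUNT(*) FROM samples").fetchone()
        return count

    def __getitem__(self, index):
        row = self.connection.execute(
            "SELECT features, label FROM samples WHERE id = ?", (index,)
        ).fetchone()
        if row is None:
            raise IndexError(index)
        features = json.loads(row[0])  # parse the stored sample
        if self.transform is not None:
            features = self.transform(features)  # e.g. data augmentation
        return features, row[1]


# Build the database once, before training (in memory here, for the demo).
connection = sqlite3.connect(":memory:")
connection.execute(
    "CREATE TABLE samples (id INTEGER PRIMARY KEY, features TEXT, label INTEGER)"
)
rows = [(i, json.dumps([float(i), float(i) * 2.0]), i % 2) for i in range(4)]
connection.executemany("INSERT INTO samples VALUES (?, ?, ?)", rows)
connection.commit()


def scale(features):
    """Stand-in for a real transform or augmentation step."""
    return [value * 10.0 for value in features]


dataset = SQLiteDataset(connection, transform=scale)
```

With PyTorch installed, an object like `dataset` could then be handed to `torch.utils.data.DataLoader` for batching and shuffling. With multiple worker processes, each worker would probably need to open its own database connection.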
Totally clear now … Thanks a lot