Generating Data in Parallel

Hi Guys,

I am trying to generate data in parallel following this tutorial.

The tutorial first assumes that my dataset is already available in this format:
training_generator = SomeSingleCoreGenerator('')

I have never stored data in this format; my data is in


How do I get it into the format above so that I can follow the rest of the tutorial?

I think this example refers to the case where you use the built-in Dataset and DataLoader to build your dataset loader.
The DataLoader in particular will give you a generator like this one.
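
As a rough sketch of what that could look like: ToyDataset, make_training_generator and the in-memory tensors below are made up for illustration, since the thread does not say how the original data is stored. The idea is to wrap the data in a Dataset and let a DataLoader with num_workers > 0 load batches in worker processes.

```python
import torch
from torch.utils.data import Dataset, DataLoader


class ToyDataset(Dataset):
    """Hypothetical map-style dataset wrapping in-memory tensors."""

    def __init__(self, features, labels):
        self.features = features
        self.labels = labels

    def __len__(self):
        # Number of samples the DataLoader can index into.
        return len(self.features)

    def __getitem__(self, index):
        # Return one (sample, label) pair; the DataLoader batches these.
        return self.features[index], self.labels[index]


def make_training_generator(num_workers=0, batch_size=4):
    # Placeholder data: 10 samples with 2 features each (an assumption).
    features = torch.arange(20, dtype=torch.float32).reshape(10, 2)
    labels = torch.arange(10)
    dataset = ToyDataset(features, labels)
    # num_workers > 0 loads batches in that many worker processes.
    return DataLoader(dataset, batch_size=batch_size,
                      shuffle=False, num_workers=num_workers)


if __name__ == "__main__":
    training_generator = make_training_generator(num_workers=2)
    for batch_features, batch_labels in training_generator:
        print(batch_features.shape, batch_labels)
```

For data on disk, __getitem__ would typically read and return a single sample instead of indexing a tensor; the DataLoader part stays the same.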

Okay, I have one more question. Before following this tutorial, I was doing data parallelism by following the official PyTorch Data Parallelism tutorial.
It basically wraps up the model in nn.DataParallel(model).
So nn.DataParallel(model) doesn’t have any effect on loading the input data in parallel and data is still being loaded using a single core only?

The DataParallel module runs your model in parallel on multiple GPUs. It is not related to loading data in parallel!
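
To make the distinction concrete, here is a minimal sketch in which the two settings are configured independently. The build helper, the random tensors and the nn.Linear model are placeholders, not anything from the tutorials. num_workers on the DataLoader controls how many processes load data; nn.DataParallel only splits each batch across GPUs for the forward pass.

```python
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset


def build(num_workers=0):
    # Placeholder data: 16 random samples with 8 features, binary labels.
    dataset = TensorDataset(torch.randn(16, 8), torch.randint(0, 2, (16,)))
    # Data loading parallelism: worker processes on the CPU side.
    loader = DataLoader(dataset, batch_size=4, num_workers=num_workers)

    model = nn.Linear(8, 2)
    # Model parallelism across GPUs: only wrap when more than one is present.
    if torch.cuda.device_count() > 1:
        model = nn.DataParallel(model)
    device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
    return loader, model.to(device), device


if __name__ == "__main__":
    loader, model, device = build(num_workers=2)
    for inputs, targets in loader:
        outputs = model(inputs.to(device))
        print(outputs.shape)
```

With num_workers=0 the data is loaded in the main process regardless of whether the model is wrapped in nn.DataParallel.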
