How to distinguish datasets spawned by the DataLoader module?

If the Dataset and DataLoader modules are used as described in the documentation, the datasets spawned in each worker are identical copies. In my case, however, I use a model inside the dataset's __getitem__ function to preprocess the data. If I don't specify the CUDA index, all the models used for preprocessing in the datasets will occupy the same GPU. So how can I get something like an index in each Dataset.__init__ call, so that I can assign different GPUs to the models in the datasets?

You could try to get the worker id and use it to move the model to the corresponding device.

Hi, how can I get the worker id? Are there any links you would suggest?

This code should work inside Dataset.__getitem__:

    def __getitem__(self, index):
        # get_worker_info() returns None when called in the main process
        worker_info = torch.utils.data.get_worker_info()
        if worker_info is not None:
            worker_id = worker_info.id
            print('worker_id {} calling with index {}'.format(worker_id, index))
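
To go from the worker id to a GPU, something along these lines might work. It is only a sketch under a few assumptions: `device_for_worker` and `PreprocessingDataset` are hypothetical names, `torch.nn.Identity()` stands in for the real preprocessing model, and the model is created lazily on the first `__getitem__` call rather than in `__init__`, because `__init__` runs in the main process before the workers exist.

```python
import torch
from torch.utils.data import Dataset, get_worker_info


def device_for_worker(worker_id, num_gpus):
    """Map a worker id to a device, round-robin over the visible GPUs."""
    if num_gpus == 0:
        return torch.device("cpu")  # fallback when no GPU is visible
    return torch.device("cuda:{}".format(worker_id % num_gpus))


class PreprocessingDataset(Dataset):  # hypothetical name
    def __init__(self, data):
        self.data = data
        self.model = None   # built lazily, once per worker process
        self.device = None

    def _ensure_model(self):
        if self.model is None:
            info = get_worker_info()
            # info is None in the main process (num_workers=0); use id 0 there
            worker_id = info.id if info is not None else 0
            self.device = device_for_worker(worker_id, torch.cuda.device_count())
            # Placeholder for the real preprocessing model (assumption)
            self.model = torch.nn.Identity().to(self.device)

    def __getitem__(self, index):
        self._ensure_model()
        x = self.data[index].to(self.device)
        with torch.no_grad():
            # Return a CPU tensor so the batch can be passed between processes
            return self.model(x).cpu()

    def __len__(self):
        return len(self.data)
```

With `DataLoader(dataset, num_workers=4)` and two GPUs, workers 0 and 2 would end up on `cuda:0` and workers 1 and 3 on `cuda:1`. Note that using CUDA inside worker processes usually requires the "spawn" start method (for example `multiprocessing_context="spawn"` in the `DataLoader`), since CUDA cannot be re-initialized in a forked subprocess.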