Hi,

I have data where each sample consists of tensors plus a string, i.e.:

[image c x w x h tensor, image c x w x h tensor, groundtruth number c x w x h tensor, strings],

where the strings are not class labels like “cat” or “dog” that can be encoded as 0, 1, …, but just information about where the tensor data came from (e.g. “/data/asdf/qwer.png”).

I’ve been successfully using PyTorch’s Dataset and DataLoader to load the data (excluding the strings) onto the 4 GPUs I have (e.g. with a batch size of 32, each GPU gets the 2 image tensors and the ground-truth tensor with a batch dimension of 8).

Then I wanted to feed in the strings as well, because I need to do some processing based on them in the forward method of my network, and realized that strings cannot be turned into tensors. **The DataLoader successfully groups 32 randomly sampled strings into a batch, but each of the 4 GPUs gets the exact same 32 strings instead of 8 strings each.**

Is there any way this can be done?
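
To make the setup concrete, here is a minimal CPU-only sketch of what I mean. `PathDataset`, the placeholder paths, and the tiny tensor shapes are made up for illustration and are not my real code. It also shows one idea I am considering but have not confirmed is the right approach: returning the integer sample index next to the string, since the index does collate into a tensor and so could be split along the batch dimension like the images. The `torch.chunk` call only mimics a 4-way split; no GPUs are involved.

```python
import torch
from torch.utils.data import Dataset, DataLoader


class PathDataset(Dataset):
    """Hypothetical stand-in for my dataset: two images, a ground truth, a path."""

    def __init__(self, n=64):
        # Placeholder paths; the real ones look like "/data/asdf/qwer.png".
        self.paths = [f"/data/sample_{i}.png" for i in range(n)]

    def __len__(self):
        return len(self.paths)

    def __getitem__(self, i):
        # Tiny c x w x h placeholders instead of real image data.
        img_a = torch.zeros(3, 4, 4)
        img_b = torch.zeros(3, 4, 4)
        gt = torch.zeros(3, 4, 4)
        # Also return the integer index: unlike the string, it collates into a tensor.
        return img_a, img_b, gt, self.paths[i], i


dataset = PathDataset()
loader = DataLoader(dataset, batch_size=32, shuffle=True)
img_a, img_b, gt, paths, idx = next(iter(loader))

# `paths` is a plain Python sequence of 32 strings; `idx` is a tensor of shape (32,).
# Mimic a 4-way split along the batch dimension (assumption: chunks of 8, CPU only).
idx_chunks = torch.chunk(idx, 4, dim=0)

# Each chunk can recover its own 8 strings by looking the indices up again.
paths_per_chunk = [[dataset.paths[j] for j in chunk.tolist()] for chunk in idx_chunks]
```

With this, the lookup from index back to string would have to happen inside forward (or wherever the per-GPU chunk is available), which requires the network to have access to the list of paths.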

Thanks,