Hello,
I read Thomas Wolf’s article on balancing the load across multiple GPUs, and I would like to adapt it for my training. He mentions that with DataParallelModel, unlike torch.nn.DataParallel, the predictions from the forward pass (predictions = parallel_model(inputs)) are a tuple of n tensors, one per GPU, where n is the number of GPUs used for training.
Here is the code for his implementation. To recap, I simply wrap the model like this:
parallel_model = DataParallelModel(model)
predictions = parallel_model(inputs)
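If I understand correctly, with for example 2 GPUs I would then expect something like the following (this is just my assumption based on the article, not something I have verified yet):

```python
# My assumption of what the tuple looks like with, say, 2 GPUs:
# predictions[0] -> shape [batch_size // 2, num_classes] on cuda:0
# predictions[1] -> shape [batch_size // 2, num_classes] on cuda:1
print(type(predictions), len(predictions))
for p in predictions:
    print(p.shape, p.device)
```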
This will affect how I currently compute the accuracy, because I use torch.max to get the predicted class from the output tensor like this:
_, pred = torch.max(predictions.data, dim=1)
correct += (pred == label).sum().item()
total += label.size(0)
std_acc = (correct / total) * 100
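For context, my full evaluation loop currently looks roughly like this (val_loader, device, and model are just placeholders for my actual objects):

```python
import torch

# val_loader, device, and model stand in for my actual DataLoader, device, and network
model.eval()
correct, total = 0, 0
with torch.no_grad():
    for inputs, label in val_loader:
        inputs, label = inputs.to(device), label.to(device)
        predictions = model(inputs)             # a single tensor on one GPU
        _, pred = torch.max(predictions.data, dim=1)
        correct += (pred == label).sum().item()
        total += label.size(0)

std_acc = (correct / total) * 100
```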
I am thinking this can be solved by iterating through the tuple predictions like this:
for i in range(len(gpu_list)):
    _, pred = torch.max(predictions[i].data, dim=1)
    correct += (pred == label).sum().item()
    total += label.size(0)

std_acc = (correct / total) * 100
However, each tensor is located on a different GPU, so iterating through the tuple like this doesn’t seem to make sense. How can I access the pred for each tensor in predictions, given that they live on different GPUs, and how can I compare them to their corresponding labels?
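For what it’s worth, here is a rough sketch of what I had in mind. I am not sure it is correct: I am assuming that DataParallelModel scatters the batch along dim 0, so that predictions[i] lines up with a consecutive slice of label, and that moving that slice to pred’s device with .to() is enough to compare across devices:

```python
import torch

correct, total, offset = 0, 0, 0
for out in predictions:  # one output tensor per GPU
    _, pred = torch.max(out, dim=1)
    # assumption: this GPU's outputs correspond to the next out.size(0) labels
    chunk_labels = label[offset:offset + out.size(0)].to(pred.device)
    correct += (pred == chunk_labels).sum().item()
    total += chunk_labels.size(0)
    offset += out.size(0)

std_acc = (correct / total) * 100
```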
Or is there a better way to calculate the accuracy? Thank you.