Calculating Accuracy in a Single-Machine, Multi-GPU Setup

I read Thomas Wolf’s article on balancing load across multiple GPUs, and I would like to adapt it for my training. He mentions that with DataParallelModel, unlike torch.nn.DataParallel, the output of the forward pass (predictions = parallel_model(inputs)) is a tuple of n tensors, with each tensor located on a specific GPU (where n is the number of GPUs used for training).

Here is the code for his implementation. To recap, I simply wrap the model like this:

parallel_model = DataParallelModel(model)
predictions = parallel_model(inputs)

This will affect how I currently compute the accuracy, because I use torch.max to get the predicted class from the output tensor like this:

_, pred = torch.max(output, dim=1)
correct += (pred == label).sum().item()
total += label.size(0)
std_acc = (correct / total) * 100
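For context, here is a minimal single-device version of that computation, with made-up logit values for illustration (`output` and `label` are hypothetical placeholders):

```python
import torch

# Made-up logits for a batch of 3 samples, 2 classes.
output = torch.tensor([[0.2, 0.8],
                       [0.9, 0.1],
                       [0.4, 0.6]])
label = torch.tensor([1, 0, 0])

# torch.max over dim=1 returns (max values, argmax indices);
# the indices are the predicted classes.
_, pred = torch.max(output, dim=1)
correct = (pred == label).sum().item()
total = label.size(0)
std_acc = (correct / total) * 100  # 2 of 3 correct
```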

I am thinking this can be solved by iterating through the tuple predictions like this:

for i in range(len(gpu_list)):
    _, pred = torch.max(predictions[i].data, dim=1)
    correct += (pred == label).sum().item()
    total += label.size(0)
std_acc = (correct / total) * 100

However, each tensor is located on a different GPU, so iterating through the tuple doesn’t seem to make sense. How can I access pred for each tensor in predictions given that they are on different GPUs, and how can I compare them to their original labels?

Or is there a better way to calculate the accuracy? Thank you.

You could use the loop and push all predictions to the default device (or the device where label is stored).
Note that the usage of .data is deprecated and might yield unwanted side effects, so you should call torch.max on the tensor directly.
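A minimal sketch of that approach, assuming the output chunks are ordered to match the batch (so each chunk corresponds to a contiguous slice of the labels, which is how the scatter in data parallelism splits the batch). The function name and variables are illustrative, not from the article:

```python
import torch

def accuracy_from_chunks(predictions, labels):
    """Accuracy from a tuple of per-device output chunks.

    Each chunk is moved to the device where `labels` lives and compared
    against the matching contiguous slice of the label batch.
    """
    correct = 0
    total = 0
    offset = 0
    for out in predictions:
        out = out.to(labels.device)          # gather the chunk onto labels' device
        pred = torch.argmax(out, dim=1)      # same indices as torch.max(out, dim=1)[1]
        chunk_labels = labels[offset:offset + out.size(0)]
        correct += (pred == chunk_labels).sum().item()
        total += chunk_labels.size(0)
        offset += out.size(0)
    return (correct / total) * 100
```

On CPU this reduces to ordinary slicing, so it can be checked without multiple GPUs; with DataParallelModel the `.to(labels.device)` call is what moves each chunk off its GPU before the comparison.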