I read Thomas Woolf’s article on balancing the load across multiple GPUs, and I would like to adapt it for my training. He mentions that with `DataParallelModel`, unlike with `torch.nn.DataParallel`, the predictions from the forward pass (`predictions = parallel_model(inputs)`) are a tuple of n tensors, each located on a specific GPU (where n is the number of GPUs used for training).
Here is the code for his implementation. To recap, I simply wrap the model like this:

```python
parallel_model = DataParallelModel(model)
predictions = parallel_model(inputs)
```
This will affect how I currently compute the accuracy, because I use `torch.max` to get the prediction from the tensors like this:

```python
_, pred = torch.max(predictions.data, dim=1)
correct += (pred == label).sum().item()
total += label.size(0)
std_acc = (correct / total) * 100
```
I am thinking this can be solved by iterating through the tuple `predictions` like this:

```python
for i in range(len(gpu_list)):
    _, pred = torch.max(predictions[i].data, dim=1)
    correct += (pred == label).sum().item()
    total += label.size(0)
std_acc = (correct / total) * 100
```
However, each tensor is located on a different GPU, so iterating through the tuple this way doesn’t seem straightforward. How can I access the `pred` for each tensor in `predictions` given that they are on different GPUs, and how can I compare them to their original labels?
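For concreteness, here is a CPU-only sketch of the kind of gathering I imagine might work, with plain CPU tensors standing in for the per-GPU shards. The idea of moving every shard back to one device with `.cpu()` and concatenating before comparison is my assumption, not something from the article:

```python
import torch

# Simulated output of DataParallelModel: a tuple of per-device logits.
# (Plain CPU tensors stand in here for tensors living on different GPUs.)
predictions = (
    torch.tensor([[0.1, 0.9], [0.8, 0.2]]),  # shard from "GPU 0"
    torch.tensor([[0.3, 0.7], [0.6, 0.4]]),  # shard from "GPU 1"
)
labels = torch.tensor([1, 0, 1, 0])  # original, unsplit batch labels

# Move every shard to a common device and re-assemble the full batch,
# since tensors on different devices cannot be compared directly.
all_preds = torch.cat([p.cpu() for p in predictions], dim=0)

_, pred = torch.max(all_preds, dim=1)
correct = (pred == labels).sum().item()
total = labels.size(0)
std_acc = (correct / total) * 100
```

This sidesteps the per-GPU comparison entirely by reconstructing the batch on one device, at the cost of a device-to-host copy per shard.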
Or is there a better way to calculate the accuracy? Thank you.