I use torch.nn.DataParallel for multi-GPU training. In my model's forward function I need to initialize a torch.zeros tensor, to which several logits (computed from the input tensors) are then added. The code is as follows:
linear_logit = torch.zeros([X.shape[0], 1]).to(self.device)
...
linear_logit += sparse_feat_logit
...
linear_logit += dense_value_logit
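For context, here is a minimal self-contained module showing the same pattern (the layer names are simplified placeholders for my actual layers; on CPU the mismatch does not surface, but the structure is the same):

```python
import torch
import torch.nn as nn

class ToyModel(nn.Module):
    """Simplified stand-in for my real model (layer names are placeholders)."""
    def __init__(self, in_dim, device):
        super().__init__()
        self.device = device  # fixed at construction time -- the source of the problem
        self.sparse_linear = nn.Linear(in_dim, 1)  # produces sparse_feat_logit
        self.dense_linear = nn.Linear(in_dim, 1)   # produces dense_value_logit

    def forward(self, X):
        # The accumulator is always created on self.device (cuda:0),
        # even when DataParallel runs this replica on cuda:1.
        linear_logit = torch.zeros([X.shape[0], 1]).to(self.device)
        linear_logit += self.sparse_linear(X)
        linear_logit += self.dense_linear(X)
        return linear_logit

model = ToyModel(4, device=torch.device("cpu"))
out = model(torch.randn(8, 4))
print(out.shape)  # torch.Size([8, 1])
```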
The parallel part code:
model = torch.nn.DataParallel(model, device_ids=[0,1])
Then I got this error:
RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cuda:0 and cuda:1!
This is because the input data and sparse_feat_logit are on cuda:1, while linear_logit is on cuda:0 (self.device is cuda:0, since the model has to be moved to cuda:0 before DataParallel replicates it).
I tried to infer the current GPU from the input tensors, but that raised other errors, because some of the input data may be empty sometimes.
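Roughly, this is what I tried (the input lists and the helper name are simplified stand-ins for my actual code; the point is that an all-empty batch leaves no tensor to read the device from):

```python
import torch

def infer_device(sparse_inputs, dense_inputs):
    # Take the device from the first non-empty input tensor.
    for t in sparse_inputs + dense_inputs:
        if t.numel() > 0:
            return t.device
    raise RuntimeError("all inputs are empty, cannot infer device")

# Works when at least one input is non-empty:
dev = infer_device([torch.randn(3, 2)], [])
print(dev)  # cpu

# Fails in batches where every input happens to be empty:
try:
    infer_device([torch.empty(0, 2)], [])
except RuntimeError as e:
    print(e)
```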
I would like to know how to get the current GPU index without relying on the input tensors. Is there perhaps a function or property on the model itself? Thank you!