Hi,

I’m using `DataParallel` to do multi-GPU training. I need to initialize a `torch.zeros` tensor in my model’s `forward` function, to which several logits (calculated from the input tensors) are then added. The code is as follows:

```
# accumulator for the logit terms, pinned to self.device (cuda:0)
linear_logit = torch.zeros([X.shape[0], 1]).to(self.device)
...
# each logit below is computed from the inputs, on the replica's GPU
linear_logit += sparse_feat_logit
...
linear_logit += dense_value_logit
```

The parallelization code:

```
# the model is already on cuda:0 (device_ids[0]) before wrapping
model = torch.nn.DataParallel(model, device_ids=[0, 1])
```

Then I got this error:

```
RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cuda:0 and cuda:1!
```

This is because the input data and `sparse_feat_logit` are on cuda:1, but `linear_logit` is on cuda:0 (`self.device` is cuda:0, since the model has to be on cuda:0 before it is wrapped in `torch.nn.DataParallel`).
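
To make the failure concrete, here is a minimal sketch of the situation (`ToyModel` and `fc` are simplified stand-ins for my real model):

```
import torch
import torch.nn as nn

class ToyModel(nn.Module):  # simplified stand-in for my real model
    def __init__(self, device):
        super().__init__()
        self.device = device        # fixed to cuda:0 at construction time
        self.fc = nn.Linear(8, 1)   # placeholder layer

    def forward(self, X):
        # allocated on cuda:0 in *every* replica, because self.device is a
        # plain attribute that DataParallel copies unchanged when replicating
        linear_logit = torch.zeros([X.shape[0], 1]).to(self.device)
        # self.fc's weights live on the replica's own GPU, so on the cuda:1
        # replica this mixes cuda:0 and cuda:1 tensors
        linear_logit += self.fc(X)
        return linear_logit

model = ToyModel(device=torch.device('cuda:0')).to('cuda:0')
model = torch.nn.DataParallel(model, device_ids=[0, 1])
out = model(torch.randn(16, 8))  # RuntimeError: ... cuda:0 and cuda:1!
```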

I tried to get the current GPU from the input tensors, but I ran into other errors because some of the inputs may be empty for a given batch.
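
What I tried looked roughly like this (reconstructed from memory; `sparse_feat` stands for one of my real inputs):

```
# inside forward: infer the replica's GPU from one of the input tensors
device = sparse_feat.device  # breaks when sparse_feat is empty/missing for a batch
linear_logit = torch.zeros([X.shape[0], 1], device=device)
```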

I would like to know how to get the current GPU inside `forward` without relying on the input tensors. Maybe there is a function or property on the model itself?
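
For example, would something along these lines be reliable under `DataParallel`? Both calls exist in PyTorch; I’m just guessing at which one fits here:

```
import torch
import torch.nn as nn

class ToyModel(nn.Module):
    def __init__(self):
        super().__init__()
        self.fc = nn.Linear(8, 1)  # placeholder layer

    def forward(self, X):
        # option 1: the replica's parameters live on the replica's own GPU
        device = next(self.parameters()).device
        # option 2 (if DataParallel sets the replica's GPU as the current
        # CUDA device while running forward):
        # device = torch.device('cuda', torch.cuda.current_device())
        linear_logit = torch.zeros([X.shape[0], 1], device=device)
        return linear_logit + self.fc(X)
```

Thank you!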