# How to use multiple GPUs in customized network layers?

I built a network using customized layers. It runs fine on a single GPU but crashes when using two GPUs on a server. The code and error message are shown below. It seems that one of the tensors was split across the two GPUs while the other was not. Was this caused by the customized forward function? How should I solve it? Thanks!

Code:

```python
os.environ['CUDA_VISIBLE_DEVICES'] = '0,1'

class Cov1(layers):

    def __init__(self, in_dim=Fdim, out_dim=Fdim, bias=True):
        super(Cov1, self).__init__(in_dim, out_dim, bias)

    def forward(self, seq, sum_idx):
        simcov1 = torch.zeros(seq.shape).cuda()

        for i in range(self.in_dim):
            SeqDist = Vsets(seq[:, i].unsqueeze(1))
            simcov1[:, i] = torch.mean(SeqDist * sum_idx, 1)

        simcov1 = 1 - simcov1
        if self.bias is not None:
            mean_dist = simcov1.matmul(self.weight) + self.bias
        else:
            mean_dist = simcov1.matmul(self.weight)
        return mean_dist
```

Error screen: (screenshot not reproduced)

If you are using `DataParallel`, the assumption is that all input tensors have the same size along the first (batch) dimension; otherwise the splitting behavior becomes tricky to reason about.

What are the input shapes (and the meaning of each dimension) being passed, and is `DataParallel` being used?
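To see why mismatched first dimensions break things: `DataParallel` scatters every tensor argument along dim 0 across the replicas, so a (446, 446) matrix and a (446,) vector each get cut in half along their own first dimension, and the halves no longer broadcast together. A minimal CPU sketch of the same arithmetic, using `torch.chunk` to mimic the two-GPU scatter (variable names are illustrative, not from the original code):

```python
import torch

# Illustrative shapes from this thread: SeqDist is (446, 446), sum_idx is (446,)
seq_dist = torch.randn(446, 446)
sum_idx = torch.randn(446)

# On a single GPU, (446, 446) * (446,) broadcasts fine:
# sum_idx is applied to every row of seq_dist.
ok = seq_dist * sum_idx

# DataParallel scatters each input along dim 0. With two GPUs,
# each replica receives one half of each argument:
halves_a = torch.chunk(seq_dist, 2, dim=0)  # two (223, 446) pieces
halves_b = torch.chunk(sum_idx, 2, dim=0)   # two (223,) pieces

# Inside a replica, (223, 446) * (223,) fails to broadcast:
# the trailing dimensions 446 and 223 do not match, so the forward crashes.
```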

Thanks eqy. Yes, `DataParallel` is used for the model: `model = nn.DataParallel(model).cuda()`. The first tensor `SeqDist` is 446x446 and the second tensor `sum_idx` is 446. The multiplication `SeqDist * sum_idx` selects the rows specified by `sum_idx` and computes each row's average.
If the first dimension of the tensors changes, I don't know how to make the multiplication work…
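For reference, the masked row-average that this multiplication performs can be sketched on CPU like so (the concrete mask here is hypothetical; `sum_idx` stands in for whatever 0/1 selection the original code builds):

```python
import torch

seq_dist = torch.randn(446, 446)  # pairwise values, one row per sequence
sum_idx = torch.zeros(446)
sum_idx[:10] = 1.0                # hypothetical mask selecting 10 entries

# sum_idx broadcasts across rows: each row of seq_dist is masked
# elementwise, then averaged along dim 1, as in the forward() loop body.
row_means = torch.mean(seq_dist * sum_idx, dim=1)  # shape (446,)
```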

In this case, can you simply make this data-parallel by doing something like making `SeqDist` (N, 446, 446) and `sum_idx` (N, 446)?
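That suggestion can be sketched as follows: give both tensors a leading batch dimension `N` (the value 8 below is just an example), so that `DataParallel`'s scatter along dim 0 cuts both tensors into consistent halves of the same batch. The per-sample broadcast is then recovered by inserting the row axis with `unsqueeze`:

```python
import torch

N = 8  # hypothetical batch size
seq_dist = torch.randn(N, 446, 446)
sum_idx = torch.randn(N, 446)

# After DataParallel's scatter along dim 0, each replica would see
# (N/2, 446, 446) and (N/2, 446) -- matching halves of the same batch.
# Per sample, the original (446, 446) * (446,) broadcast is recovered
# by unsqueezing sum_idx to (N, 1, 446) and averaging over the last dim:
out = torch.mean(seq_dist * sum_idx.unsqueeze(1), dim=2)  # shape (N, 446)
```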

It works! Thanks a lot! I revised my code to fit the splitting mechanism.