Expected all tensors to be on the same device, but found at least two devices, cuda:0 and cpu! error

    import numpy as np
    import torch
    import torch.autograd as autograd
    from torch.autograd import Variable

    # Tensor and lambda_gp are globals defined elsewhere in the script
    def compute_gradient_penalty(D, X):
        """Calculates the gradient penalty loss for DRAGAN"""
        # Random weight term for interpolation
        alpha = Tensor(np.random.random(size=X.shape))
        # Interpolate between the real samples and a noise-perturbed copy
        interpolates = alpha * X + ((1 - alpha) * (X + 0.5 * X.std() * torch.rand(X.size())))
        interpolates = Variable(interpolates, requires_grad=True)
        d_interpolates = D(interpolates)
        fake = Variable(Tensor(X.shape[0], 1).fill_(1.0), requires_grad=False)
        # Get gradient w.r.t. interpolates
        gradients = autograd.grad(
            outputs=d_interpolates,
            inputs=interpolates,
            grad_outputs=fake,
            create_graph=True,
            retain_graph=True,
            only_inputs=True,
        )[0]
        gradient_penalty = lambda_gp * ((gradients.norm(2, dim=1) - 1) ** 2).mean()
        return gradient_penalty

Which device are the inputs D and X on?

If they are on the GPU, you should move alpha to the same device first:

    cuda = True if torch.cuda.is_available() else False
    if cuda:
        generator.cuda()
        discriminator.cuda()
        adversarial_loss.cuda()

    Tensor = torch.cuda.FloatTensor if cuda else torch.FloatTensor
    real_imgs = Variable(imgs.type(Tensor))

Then calculate the gradient penalty:

    gradient_penalty = compute_gradient_penalty(discriminator, real_imgs.data)
    gradient_penalty.backward()
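As an aside, the same setup can be written device-agnostically with the `.to(device)` API (a sketch, assuming PyTorch >= 0.4; the names are the ones from the snippet above):

    device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
    generator.to(device)
    discriminator.to(device)
    adversarial_loss.to(device)
    real_imgs = imgs.to(device)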

Move alpha to the same device:

    alpha = Tensor(np.random.random(size=X.shape))
    alpha = alpha.cuda()

    interpolates = alpha * X + ((1 - alpha) * (X + 0.5 * X.std() * torch.rand(X.size())))
    interpolates = Variable(interpolates, requires_grad=True)

Same error.

Can you print .device for interpolates, X, alpha, lambda_gp, and fake at the end and show what it prints?
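For example, something like this inside compute_gradient_penalty, just before the autograd.grad call (a sketch; lambda_gp is a plain Python float in the script, so it has no .device):

    print("X", X.device)
    print("alpha", alpha.device)
    print("interpolates", interpolates.device)
    print("fake", fake.device)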

    # Random weight term for interpolation
    alpha = Tensor(np.random.random(size=X.shape))
    alpha = alpha.cuda()

    print("X ", X)

The output is:
X  tensor([[[[-0.2314,  0.0980,  0.2157,  ...,  0.1137,  0.3098,  0.3176],
      [-0.1608,  0.2549,  0.3020,  ...,  0.0588,  0.2078,  0.2392],
      [ 0.0745,  0.3490,  0.3333,  ...,  0.0275,  0.1765,  0.3098],
      ...,
      [ 0.1843,  0.2235,  0.0667,  ...,  0.0510,  0.0902,  0.0431],
      [ 0.0902,  0.0745,  0.0196,  ...,  0.0431,  0.0118, -0.1216],
      [ 0.0431,  0.0667,  0.1451,  ...,  0.2471,  0.2549,  0.1451]]],

    [[[ 0.0118,  0.0431,  0.0745,  ..., -0.0118, -0.0118, -0.0118],
      [ 0.0196,  0.0510,  0.0431,  ...,  0.0353, -0.0039,  0.0196],
      [ 0.0275,  0.0431,  0.0196,  ...,  0.0431,  0.0353,  0.0275],
      ...,
      [ 0.0353,  0.0196,  0.0039,  ..., -0.0353, -0.0039, -0.0118],
      [-0.0510, -0.0588, -0.0902,  ...,  0.0039,  0.0275,  0.0588],
      [-0.0902, -0.0980, -0.1137,  ...,  0.0196,  0.0431,  0.0588]]],


    [[[ 0.1137,  0.0824,  0.1059,  ...,  0.0745,  0.0824,  0.1137],
      [ 0.0510,  0.0902,  0.1137,  ...,  0.1686,  0.1843,  0.1608],
      [ 0.0667,  0.1294,  0.0667,  ...,  0.1373,  0.1294,  0.1059],
      ...,
      [ 0.0745,  0.0431,  0.0353,  ...,  0.1294,  0.1765,  0.1922],
      [ 0.0196,  0.0039,  0.0196,  ...,  0.1608,  0.1765,  0.1922],
      [ 0.0745,  0.1137,  0.0902,  ...,  0.1608,  0.1451,  0.1529]]],


    ...,


    [[[ 0.2078,  0.2314,  0.2471,  ...,  0.2078,  0.2157,  0.2314],
      [ 0.2235,  0.2314,  0.2627,  ...,  0.1216,  0.1294,  0.0667],
      [ 0.2235,  0.2157,  0.2392,  ...,  0.0745,  0.0902,  0.1216],
      ...,
      [ 0.1373,  0.1373,  0.1137,  ...,  0.1608,  0.1765,  0.1608],
      [ 0.0588,  0.0275,  0.0196,  ...,  0.2078,  0.2078,  0.2000],
      [ 0.0745,  0.1137,  0.1294,  ...,  0.2000,  0.2157,  0.2314]]],


    [[[ 0.4196,  0.3098,  0.1137,  ...,  0.1529,  0.2157,  0.2471],
      [ 0.4039,  0.3098,  0.1059,  ...,  0.1608,  0.2549,  0.2784],
      [ 0.3647,  0.2941,  0.1294,  ...,  0.1843,  0.2549,  0.2706],
      ...,
      [ 0.3490,  0.4196,  0.4039,  ...,  0.4118,  0.3569,  0.3333],
      [ 0.3882,  0.4118,  0.3804,  ...,  0.4196,  0.3804,  0.3412],
      [ 0.4196,  0.3961,  0.3804,  ...,  0.4196,  0.3882,  0.3255]]],


    [[[ 0.1922,  0.1765,  0.1451,  ...,  0.1294,  0.2784,  0.3412],
      [ 0.2941,  0.2863,  0.2706,  ...,  0.1059,  0.2941,  0.3569],
      [ 0.3098,  0.2941,  0.3020,  ...,  0.0745,  0.1216,  0.1922],
      ...,
      [ 0.0588,  0.0353,  0.1686,  ...,  0.0588,  0.1922,  0.1765],
      [ 0.0353, -0.0039,  0.0353,  ...,  0.0745,  0.1059,  0.0824],
      [ 0.1608,  0.1216,  0.0510,  ...,  0.1451,  0.1373,  0.0667]]]],
   device='cuda:0')

alpha  tensor([[[[0.3812, 0.8624, 0.7040,  ..., 0.9979, 0.3402, 0.3827],
      [0.3474, 0.6531, 0.5925,  ..., 0.6165, 0.2507, 0.7213],
      [0.8703, 0.4955, 0.9952,  ..., 0.9588, 0.0797, 0.2977],
      ...,
      [0.0519, 0.5157, 0.7973,  ..., 0.9310, 0.6192, 0.7798],
      [0.8278, 0.3567, 0.9265,  ..., 0.6293, 0.8221, 0.6945],
      [0.4674, 0.0350, 0.0032,  ..., 0.6781, 0.1274, 0.1782]]],

    [[[0.5527, 0.1645, 0.1896,  ..., 0.7192, 0.1765, 0.8997],
      [0.3381, 0.5006, 0.5216,  ..., 0.1386, 0.2940, 0.2124],
      [0.0522, 0.0788, 0.5180,  ..., 0.3680, 0.7478, 0.8783],
      ...,
      [0.4037, 0.3928, 0.2881,  ..., 0.1331, 0.5598, 0.2404],
      [0.3291, 0.1470, 0.5871,  ..., 0.3658, 0.9136, 0.7088],
      [0.4804, 0.2486, 0.9075,  ..., 0.7257, 0.3983, 0.0297]]],


    [[[0.1584, 0.1374, 0.2812,  ..., 0.5362, 0.8423, 0.9169],
      [0.9268, 0.4764, 0.0057,  ..., 0.0157, 0.5602, 0.3994],
      [0.2443, 0.1408, 0.8351,  ..., 0.0033, 0.7463, 0.7172],
      ...,
      [0.4969, 0.5833, 0.7443,  ..., 0.8887, 0.4303, 0.5720],
      [0.5244, 0.4705, 0.9060,  ..., 0.2612, 0.6273, 0.7193],
      [0.0140, 0.3500, 0.0166,  ..., 0.8386, 0.6376, 0.3062]]],


    ...,


    [[[0.0487, 0.6701, 0.6398,  ..., 0.3734, 0.9392, 0.2372],
      [0.4167, 0.4142, 0.9122,  ..., 0.4999, 0.2736, 0.1412],
      [0.8596, 0.9482, 0.4058,  ..., 0.5347, 0.3518, 0.2112],
      ...,
      [0.9427, 0.5980, 0.2776,  ..., 0.3586, 0.0116, 0.3897],
      [0.6982, 0.8484, 0.0603,  ..., 0.2687, 0.6425, 0.4971],
      [0.2935, 0.8600, 0.5418,  ..., 0.4609, 0.9929, 0.8995]]],


    [[[0.5574, 0.6402, 0.2937,  ..., 0.4163, 0.3299, 0.5989],
      [0.7338, 0.4624, 0.8349,  ..., 0.7911, 0.6597, 0.0016],
      [0.4880, 0.3515, 0.3123,  ..., 0.6148, 0.7959, 0.9120],
      ...,
      [0.0232, 0.1757, 0.2756,  ..., 0.1526, 0.9084, 0.7438],
      [0.0200, 0.2719, 0.5503,  ..., 0.6192, 0.9863, 0.6379],
      [0.8273, 0.5738, 0.7902,  ..., 0.4714, 0.3506, 0.7571]]],


    [[[0.7330, 0.2260, 0.1242,  ..., 0.3046, 0.2215, 0.1959],
      [0.9643, 0.8662, 0.4990,  ..., 0.6249, 0.5383, 0.3990],
      [0.3291, 0.2416, 0.6826,  ..., 0.0722, 0.0180, 0.0977],
      ...,
      [0.0146, 0.0082, 0.2837,  ..., 0.2909, 0.5631, 0.3720],
      [0.8499, 0.1797, 0.0546,  ..., 0.7608, 0.5826, 0.2048],
      [0.1851, 0.2435, 0.6196,  ..., 0.0663, 0.7251, 0.0466]]]],
   device='cuda:0')

    Traceback (most recent call last):
      File "/home/DRAGAN/dragan.py", line 210, in <module>
        gradient_penalty = compute_gradient_penalty(discriminator, real_imgs.data)
      File "/home/DRAGAN/dragan.py", line 144, in compute_gradient_penalty
        interpolates = alpha * X + ((1 - alpha) * (X + 0.5 * X.std() * torch.rand(X.size())))
    RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cuda:0 and cpu!

print("alpha ", alpha)

interpolates = alpha * X + ((1 - alpha) * (X + 0.5 * X.std() * torch.rand(X.size())))

print("interpolates ", interpolates)

interpolates = Variable(interpolates, requires_grad=True)

Found the error:

    interpolates = alpha * X + ((1 - alpha) * (X + 0.5 * X.std() * torch.rand(X.size()).cuda()))

torch.rand was on the CPU instead of CUDA. It executes now, but I don't know if the fix is correct.

The fix makes sense: when you create a new tensor, it is allocated on the CPU by default unless you move it manually.
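A quick interpreter check confirms that default (a sketch; the outputs shown are for a machine with a single GPU):

    >>> import torch
    >>> torch.rand(3).device
    device(type='cpu')
    >>> torch.rand(3, device="cuda").device
    device(type='cuda', index=0)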

The reason you get the error even before interpolates is printed is that X.std() is on the GPU while torch.rand() returns a CPU tensor. The error is thrown as soon as the two are multiplied!
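For reference, a device-agnostic rewrite avoids this whole class of errors by creating every new tensor directly on X's device (a sketch, not the original script's code; lambda_gp is passed as a parameter here so the function is self-contained):

    import torch

    def compute_gradient_penalty(D, X, lambda_gp=10.0):
        """DRAGAN gradient penalty with every tensor created on X's device."""
        device = X.device
        # Random interpolation weights, created on the same device as X
        alpha = torch.rand(X.shape, device=device)
        # Noise-perturbed copy of X; the noise is also created on X's device
        perturbed = X + 0.5 * X.std() * torch.rand(X.size(), device=device)
        interpolates = (alpha * X + (1 - alpha) * perturbed).requires_grad_(True)
        d_interpolates = D(interpolates)
        fake = torch.ones(X.shape[0], 1, device=device)
        gradients = torch.autograd.grad(
            outputs=d_interpolates,
            inputs=interpolates,
            grad_outputs=fake,
            create_graph=True,
            retain_graph=True,
            only_inputs=True,
        )[0]
        return lambda_gp * ((gradients.norm(2, dim=1) - 1) ** 2).mean()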