Expected all tensors to be on the same device, but found at least two devices, cuda:0 and cpu! error

    import numpy as np
    import torch
    import torch.autograd as autograd
    from torch.autograd import Variable

    # Tensor and lambda_gp are globals defined elsewhere in the script
    def compute_gradient_penalty(D, X):
        """Calculates the gradient penalty loss for DRAGAN"""
        # Random weight term for interpolation
        alpha = Tensor(np.random.random(size=X.shape))
        # Interpolate between the real samples and a noise-perturbed copy
        interpolates = alpha * X + ((1 - alpha) * (X + 0.5 * X.std() * torch.rand(X.size())))
        interpolates = Variable(interpolates, requires_grad=True)
        d_interpolates = D(interpolates)
        fake = Variable(Tensor(X.shape[0], 1).fill_(1.0), requires_grad=False)
        # Get gradient w.r.t. interpolates
        gradients = autograd.grad(
            outputs=d_interpolates,
            inputs=interpolates,
            grad_outputs=fake,
            create_graph=True,
            retain_graph=True,
            only_inputs=True,
        )[0]
        gradient_penalty = lambda_gp * ((gradients.norm(2, dim=1) - 1) ** 2).mean()
        return gradient_penalty

Which device are the inputs D and X on?

If they are on the GPU, you should move alpha to the same device first:

    cuda = True if torch.cuda.is_available() else False
    if cuda:
        generator.cuda()
        discriminator.cuda()
        adversarial_loss.cuda()

    Tensor = torch.cuda.FloatTensor if cuda else torch.FloatTensor
    real_imgs = Variable(imgs.type(Tensor))

Then calculate the gradient penalty:

    gradient_penalty = compute_gradient_penalty(discriminator, real_imgs.data)
    gradient_penalty.backward()
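As an aside, the same setup can be written device-agnostically with the `.to(device)` API (a sketch, assuming PyTorch >= 0.4; the names are the ones from the snippet above):

    device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
    generator.to(device)
    discriminator.to(device)
    adversarial_loss.to(device)
    real_imgs = imgs.to(device)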

Move alpha to the same device:

    alpha = Tensor(np.random.random(size=X.shape))
    alpha = alpha.cuda()

    interpolates = alpha * X + ((1 - alpha) * (X + 0.5 * X.std() * torch.rand(X.size())))
    interpolates = Variable(interpolates, requires_grad=True)

Same error.

Can you print .device for interpolates, X, alpha, lambda_gp, and fake at the end and show what it prints?
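For example, something like this inside compute_gradient_penalty, just before the autograd.grad call (a sketch; lambda_gp is a plain Python float in the script, so it has no .device):

    print("X", X.device)
    print("alpha", alpha.device)
    print("interpolates", interpolates.device)
    print("fake", fake.device)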

    # Random weight term for interpolation
    alpha = Tensor(np.random.random(size=X.shape))
    alpha = alpha.cuda()

    print("X ", X)

The output is:
X  tensor([[[[-0.2314,  0.0980,  0.2157,  ...,  0.1137,  0.3098,  0.3176],
      [-0.1608,  0.2549,  0.3020,  ...,  0.0588,  0.2078,  0.2392],
      [ 0.0745,  0.3490,  0.3333,  ...,  0.0275,  0.1765,  0.3098],
      ...,
      [ 0.1843,  0.2235,  0.0667,  ...,  0.0510,  0.0902,  0.0431],
      [ 0.0902,  0.0745,  0.0196,  ...,  0.0431,  0.0118, -0.1216],
      [ 0.0431,  0.0667,  0.1451,  ...,  0.2471,  0.2549,  0.1451]]],

    [[[ 0.0118,  0.0431,  0.0745,  ..., -0.0118, -0.0118, -0.0118],
      [ 0.0196,  0.0510,  0.0431,  ...,  0.0353, -0.0039,  0.0196],
      [ 0.0275,  0.0431,  0.0196,  ...,  0.0431,  0.0353,  0.0275],
      ...,
      [ 0.0353,  0.0196,  0.0039,  ..., -0.0353, -0.0039, -0.0118],
      [-0.0510, -0.0588, -0.0902,  ...,  0.0039,  0.0275,  0.0588],
      [-0.0902, -0.0980, -0.1137,  ...,  0.0196,  0.0431,  0.0588]]],


    [[[ 0.1137,  0.0824,  0.1059,  ...,  0.0745,  0.0824,  0.1137],
      [ 0.0510,  0.0902,  0.1137,  ...,  0.1686,  0.1843,  0.1608],
      [ 0.0667,  0.1294,  0.0667,  ...,  0.1373,  0.1294,  0.1059],
      ...,
      [ 0.0745,  0.0431,  0.0353,  ...,  0.1294,  0.1765,  0.1922],
      [ 0.0196,  0.0039,  0.0196,  ...,  0.1608,  0.1765,  0.1922],
      [ 0.0745,  0.1137,  0.0902,  ...,  0.1608,  0.1451,  0.1529]]],


    ...,


    [[[ 0.2078,  0.2314,  0.2471,  ...,  0.2078,  0.2157,  0.2314],
      [ 0.2235,  0.2314,  0.2627,  ...,  0.1216,  0.1294,  0.0667],
      [ 0.2235,  0.2157,  0.2392,  ...,  0.0745,  0.0902,  0.1216],
      ...,
      [ 0.1373,  0.1373,  0.1137,  ...,  0.1608,  0.1765,  0.1608],
      [ 0.0588,  0.0275,  0.0196,  ...,  0.2078,  0.2078,  0.2000],
      [ 0.0745,  0.1137,  0.1294,  ...,  0.2000,  0.2157,  0.2314]]],


    [[[ 0.4196,  0.3098,  0.1137,  ...,  0.1529,  0.2157,  0.2471],
      [ 0.4039,  0.3098,  0.1059,  ...,  0.1608,  0.2549,  0.2784],
      [ 0.3647,  0.2941,  0.1294,  ...,  0.1843,  0.2549,  0.2706],
      ...,
      [ 0.3490,  0.4196,  0.4039,  ...,  0.4118,  0.3569,  0.3333],
      [ 0.3882,  0.4118,  0.3804,  ...,  0.4196,  0.3804,  0.3412],
      [ 0.4196,  0.3961,  0.3804,  ...,  0.4196,  0.3882,  0.3255]]],


    [[[ 0.1922,  0.1765,  0.1451,  ...,  0.1294,  0.2784,  0.3412],
      [ 0.2941,  0.2863,  0.2706,  ...,  0.1059,  0.2941,  0.3569],
      [ 0.3098,  0.2941,  0.3020,  ...,  0.0745,  0.1216,  0.1922],
      ...,
      [ 0.0588,  0.0353,  0.1686,  ...,  0.0588,  0.1922,  0.1765],
      [ 0.0353, -0.0039,  0.0353,  ...,  0.0745,  0.1059,  0.0824],
      [ 0.1608,  0.1216,  0.0510,  ...,  0.1451,  0.1373,  0.0667]]]],
   device='cuda:0')

alpha  tensor([[[[0.3812, 0.8624, 0.7040,  ..., 0.9979, 0.3402, 0.3827],
      [0.3474, 0.6531, 0.5925,  ..., 0.6165, 0.2507, 0.7213],
      [0.8703, 0.4955, 0.9952,  ..., 0.9588, 0.0797, 0.2977],
      ...,
      [0.0519, 0.5157, 0.7973,  ..., 0.9310, 0.6192, 0.7798],
      [0.8278, 0.3567, 0.9265,  ..., 0.6293, 0.8221, 0.6945],
      [0.4674, 0.0350, 0.0032,  ..., 0.6781, 0.1274, 0.1782]]],

    [[[0.5527, 0.1645, 0.1896,  ..., 0.7192, 0.1765, 0.8997],
      [0.3381, 0.5006, 0.5216,  ..., 0.1386, 0.2940, 0.2124],
      [0.0522, 0.0788, 0.5180,  ..., 0.3680, 0.7478, 0.8783],
      ...,
      [0.4037, 0.3928, 0.2881,  ..., 0.1331, 0.5598, 0.2404],
      [0.3291, 0.1470, 0.5871,  ..., 0.3658, 0.9136, 0.7088],
      [0.4804, 0.2486, 0.9075,  ..., 0.7257, 0.3983, 0.0297]]],


    [[[0.1584, 0.1374, 0.2812,  ..., 0.5362, 0.8423, 0.9169],
      [0.9268, 0.4764, 0.0057,  ..., 0.0157, 0.5602, 0.3994],
      [0.2443, 0.1408, 0.8351,  ..., 0.0033, 0.7463, 0.7172],
      ...,
      [0.4969, 0.5833, 0.7443,  ..., 0.8887, 0.4303, 0.5720],
      [0.5244, 0.4705, 0.9060,  ..., 0.2612, 0.6273, 0.7193],
      [0.0140, 0.3500, 0.0166,  ..., 0.8386, 0.6376, 0.3062]]],


    ...,


    [[[0.0487, 0.6701, 0.6398,  ..., 0.3734, 0.9392, 0.2372],
      [0.4167, 0.4142, 0.9122,  ..., 0.4999, 0.2736, 0.1412],
      [0.8596, 0.9482, 0.4058,  ..., 0.5347, 0.3518, 0.2112],
      ...,
      [0.9427, 0.5980, 0.2776,  ..., 0.3586, 0.0116, 0.3897],
      [0.6982, 0.8484, 0.0603,  ..., 0.2687, 0.6425, 0.4971],
      [0.2935, 0.8600, 0.5418,  ..., 0.4609, 0.9929, 0.8995]]],


    [[[0.5574, 0.6402, 0.2937,  ..., 0.4163, 0.3299, 0.5989],
      [0.7338, 0.4624, 0.8349,  ..., 0.7911, 0.6597, 0.0016],
      [0.4880, 0.3515, 0.3123,  ..., 0.6148, 0.7959, 0.9120],
      ...,
      [0.0232, 0.1757, 0.2756,  ..., 0.1526, 0.9084, 0.7438],
      [0.0200, 0.2719, 0.5503,  ..., 0.6192, 0.9863, 0.6379],
      [0.8273, 0.5738, 0.7902,  ..., 0.4714, 0.3506, 0.7571]]],


    [[[0.7330, 0.2260, 0.1242,  ..., 0.3046, 0.2215, 0.1959],
      [0.9643, 0.8662, 0.4990,  ..., 0.6249, 0.5383, 0.3990],
      [0.3291, 0.2416, 0.6826,  ..., 0.0722, 0.0180, 0.0977],
      ...,
      [0.0146, 0.0082, 0.2837,  ..., 0.2909, 0.5631, 0.3720],
      [0.8499, 0.1797, 0.0546,  ..., 0.7608, 0.5826, 0.2048],
      [0.1851, 0.2435, 0.6196,  ..., 0.0663, 0.7251, 0.0466]]]],
   device='cuda:0')

    Traceback (most recent call last):
      File "/home/DRAGAN/dragan.py", line 210, in <module>
        gradient_penalty = compute_gradient_penalty(discriminator, real_imgs.data)
      File "/home/DRAGAN/dragan.py", line 144, in compute_gradient_penalty
        interpolates = alpha * X + ((1 - alpha) * (X + 0.5 * X.std() * torch.rand(X.size())))
    RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cuda:0 and cpu!

print("alpha ", alpha)

interpolates = alpha * X + ((1 - alpha) * (X + 0.5 * X.std() * torch.rand(X.size())))

print("interpolates ", interpolates)

interpolates = Variable(interpolates, requires_grad=True)

Found the error:

    interpolates = alpha * X + ((1 - alpha) * (X + 0.5 * X.std() * torch.rand(X.size()).cuda()))

torch.rand was on the CPU instead of CUDA. It executes now, but I don't know if the fix is correct.

The fix makes sense: when you create a new tensor, it is allocated on the CPU by default unless you move it manually.
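A quick interpreter check confirms that default (a sketch; the outputs shown are for a machine with a single GPU):

    >>> import torch
    >>> torch.rand(3).device
    device(type='cpu')
    >>> torch.rand(3, device="cuda").device
    device(type='cuda', index=0)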

The reason you get the error even before interpolates is printed is that X.std() is on the GPU while torch.rand() returns a CPU tensor. The error is thrown as soon as the two are multiplied!
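For reference, a device-agnostic rewrite avoids this whole class of errors by creating every new tensor directly on X's device (a sketch, not the original script's code; lambda_gp is passed as a parameter here so the function is self-contained):

    import torch

    def compute_gradient_penalty(D, X, lambda_gp=10.0):
        """DRAGAN gradient penalty with every tensor created on X's device."""
        device = X.device
        # Random interpolation weights, created on the same device as X
        alpha = torch.rand(X.shape, device=device)
        # Noise-perturbed copy of X; the noise is also created on X's device
        perturbed = X + 0.5 * X.std() * torch.rand(X.size(), device=device)
        interpolates = (alpha * X + (1 - alpha) * perturbed).requires_grad_(True)
        d_interpolates = D(interpolates)
        fake = torch.ones(X.shape[0], 1, device=device)
        gradients = torch.autograd.grad(
            outputs=d_interpolates,
            inputs=interpolates,
            grad_outputs=fake,
            create_graph=True,
            retain_graph=True,
            only_inputs=True,
        )[0]
        return lambda_gp * ((gradients.norm(2, dim=1) - 1) ** 2).mean()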