RuntimeError of Scatter function on GPU but not on CPU

A line of code in my project, Hx.scatter(1, indices, kmatrix), threw the error “RuntimeError: invalid argument 3: Index tensor must have the same size as input tensor”. The error is raised only on the GPU, not on the CPU. This is weird, especially since it seems to relate to the variable size, which should not differ between devices.

What’s the reason for this?

Hi,

Could you give a small code sample to reproduce this issue please?

Hi, @albanD

The code below is an example.

import torch
import torch.nn as nn
import torch.nn.functional as F
from torch.nn.parameter import Parameter
from torch.autograd import Variable


BS, C = 256, 8
sn = torch.randn(1)[0]
gate_rand = Variable(torch.rand(BS, C).cuda(), requires_grad=True)
noise_rand = Variable(torch.rand(BS, C).cuda(), requires_grad=True)

gate = gate_rand + 1.0e-3
noise = noise_rand + 1.0e-4
Hx = gate + sn*noise
print(gate.requires_grad, noise.requires_grad)

topk, indices = torch.topk(Hx, C//2)
_, neg_indices = torch.topk(-Hx, C//2)
topk1, indices1 = torch.topk(Hx, C//2+1)
thresh_k = topk.min(1, keepdim=True)[0]
thresh_k1 = topk1.min(1, keepdim=True)[0]

Hx_mask = Hx.scatter(1, neg_indices, float('-inf'))

kmatrix = thresh_k.repeat(1, C)
kmatrix1 = thresh_k1.repeat(1, C)
print(kmatrix.size(), kmatrix1.size(), Hx.size(), indices.size())
kth_excluding = kmatrix.scatter(1, indices, kmatrix1)
print('Hx: ', Hx_mask.size())
print('kth_excluding: ', kth_excluding.size())

Using the current master branch, I get the following output for your code.

$ python sample.py 
(True, True)
((256, 8), (256, 8), (256, 8), (256, 4))
('Hx: ', (256, 8))
('kth_excluding: ', (256, 8))

This code crashes on your machine? What version of pytorch are you using?

I used pytorch-0.2.0_4, and it threw the RuntimeError.

Now the problem is solved after upgrading to 0.4.0, as you pointed out. Thank you
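For anyone who cannot upgrade, here is a minimal sketch of a workaround. It assumes the failure on 0.2.0 came from the index and the source having different shapes ((256, 4) versus (256, 8) in the sample above), as the error message suggests. The name src is introduced here for illustration only. Since every row of kmatrix1 holds a single repeated value, a source built with the index's shape writes the same values:

```python
import torch

BS, C = 4, 8
Hx = torch.rand(BS, C)

topk, indices = torch.topk(Hx, C // 2)        # indices: (BS, C//2)
topk1, _ = torch.topk(Hx, C // 2 + 1)
thresh_k = topk.min(1, keepdim=True)[0]       # (BS, 1)
thresh_k1 = topk1.min(1, keepdim=True)[0]     # (BS, 1)

kmatrix = thresh_k.repeat(1, C)               # (BS, C)

# Build the source with the same shape as the index, not (BS, C).
# Each row of the source holds one repeated value, so nothing is lost.
src = thresh_k1.expand_as(indices).contiguous()   # (BS, C//2)

kth_excluding = kmatrix.scatter(1, indices, src)
print(kth_excluding.size())
```

The result still has the shape of kmatrix; only the positions listed in indices are overwritten.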

I have the same problem using pytorch 0.4.1

From the master branch, this still runs fine for me. What error are you getting?

The function producing the error is the following from the Ignite library:

def to_onehot(indices, num_classes):
    onehot = torch.zeros(indices.size(0), num_classes, device=indices.device)
    return onehot.scatter_(1, indices.unsqueeze(1), 1)

The error I get is:

RuntimeError: invalid argument 3: Index tensor must have same dimensions as output tensor at c:\programdata\miniconda3\conda-bld\pytorch_1533096106539\work\aten\src\thc\generic/THCTensorScatterGather.cu:295

So this is not the same error as this one?
Anyway, this is most certainly a problem of having too many dimensions. indices should be of size batch, not batch x 1.

Ah yes, the error is the same (or at least the same kind). Right now “indices” has shape [32x112x112]. The error doesn’t change even if I remove the “indices.unsqueeze(1)” and just pass indices directly. Changing the scatter dimension from 1 to 0 doesn’t help either.

From the code that you sent, indices should be a 1D tensor and num_classes an int.

Hmm, the PyTorch documentation seems to suggest that it doesn’t need to be 1D (just the same shape as the tensor it operates on): doc link

Yes,
But the to_onehot function that you sent calls scatter on a 2D tensor that it creates, and adds one dimension to indices before passing it to scatter. So the original indices tensor given to to_onehot should be 1D.
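To make the shape requirement concrete, here is a rough pure-Python sketch of what scatter along dimension 1 does for a 2D output. The helpers scatter_dim1 and to_onehot_lists are hypothetical names used only for illustration; they work on nested lists and are not PyTorch code.

```python
def scatter_dim1(out, index, value):
    """Reference semantics of out.scatter_(1, index, value) for 2D nested lists."""
    if len(index) != len(out):
        raise ValueError("index must have one row per row of out")
    for i, row in enumerate(index):
        for j in row:
            # Each index entry selects a column in row i of the output.
            out[i][j] = value
    return out


def to_onehot_lists(labels, num_classes):
    """Mirror of to_onehot for a flat list of integer labels."""
    out = [[0.0] * num_classes for _ in labels]   # shape (batch, num_classes)
    index = [[label] for label in labels]          # like unsqueeze(1): (batch, 1)
    return scatter_dim1(out, index, 1.0)


print(to_onehot_lists([2, 0, 1], 3))
```

With 1D labels, the index becomes (batch, 1) and lines up with the 2D output. A (32, 112, 112) input would become 4D after unsqueeze(1), which no longer matches the 2D tensor that to_onehot creates.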

Ah I see. Thanks! I will try to do something else then. This is part of the official pytorch Ignite library btw - that’s why I was quite sure it should have been working.

I got it working now. I guess I was ultimately just confused by the documentation. The output measured should be a 1xN tensor for it to work. Thanks for your help!
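For label maps like the [32x112x112] tensor mentioned above, an alternative to flattening is to give the output the extra dimensions too. The following is only a sketch: to_onehot_nd is a hypothetical helper, not the Ignite function quoted earlier, and it assumes indices holds integer (int64) class labels in the range [0, num_classes).

```python
import torch


def to_onehot_nd(indices, num_classes):
    """Hypothetical helper: encode labels of shape (N, ...) as (N, num_classes, ...)."""
    onehot = torch.zeros(indices.size(0), num_classes, *indices.shape[1:],
                         device=indices.device)
    # unsqueeze(1) gives the index the same number of dimensions as onehot.
    return onehot.scatter_(1, indices.unsqueeze(1), 1)


# Small stand-in for a (32, 112, 112) label map.
labels = torch.randint(0, 5, (2, 3, 3))
encoded = to_onehot_nd(labels, 5)
print(encoded.size())
```

Because the output and the unsqueezed index have the same number of dimensions here, scatter_ accepts them, and encoded.argmax(1) recovers the original labels.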