I was creating a custom module and got the error in the title when trying to back-propagate after moving the module instance to cuda().
A simpler version looks like this:
import torch
import torch.nn.functional as F
from torch.nn import Module, Parameter

class LinearT(Module):
    def __init__(self, in_features, out_features, bias=True, transpose_flag=False):
        super(LinearT, self).__init__()
        self.in_features = in_features
        self.out_features = out_features
        self.transpose_flag = transpose_flag
        # (my real module also stores extra state here, e.g. a weight_manifold)
        self._weight = Parameter(torch.Tensor(out_features, in_features))
        # register 'weight' as a buffer so _apply()/cuda() moves it,
        # without exposing it to the optimizer as a parameter
        if self.transpose_flag:
            self.register_buffer('weight', self._weight.transpose(-2, -1))
        else:
            self.register_buffer('weight', self._weight)
        if bias:
            self.bias = Parameter(torch.Tensor(out_features))
        else:
            self.register_parameter('bias', None)
        self.reset_parameters()

    def reset_parameters(self):
        # placeholder init so the simplified example runs; my real init differs
        torch.nn.init.xavier_uniform_(self._weight)
        if self.bias is not None:
            torch.nn.init.zeros_(self.bias)

    def forward(self, input):
        return F.linear(input, self.weight, self.bias)
The error encountered is:
RuntimeError: Function torch::autograd::CopyBackwards returned an invalid gradient at index 1 - expected type torch.cuda.FloatTensor but got torch.FloatTensor
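For reference, this is roughly how I trigger it (a minimal sketch; the sizes and input are made up):

# minimal sketch of how the error shows up; sizes and input are made up
model = LinearT(3, 3, transpose_flag=True).cuda()  # moving to GPU is what breaks it
x = torch.randn(2, 3, device='cuda')
loss = model(x).sum()      # forward pass works fine
loss.backward()            # RuntimeError from CopyBackwards is raised here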
I’m creating the ._weight parameter from some other function call, and it is more complex than the example above, but this is close enough.
I used register_buffer on 'weight' to make sure it gets transferred to cuda by the _apply function, and did not make it a parameter, since the optimizer should only be optimizing _weight.
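To spell out the intent, this is what I expect the module to expose (a sketch with made-up sizes):

# what I expect: only _weight and bias are registered as parameters,
# so the optimizer never touches the derived 'weight' buffer
model = LinearT(3, 3, transpose_flag=True)
print([name for name, _ in model.named_parameters()])  # ['_weight', 'bias']
print([name for name, _ in model.named_buffers()])     # ['weight']
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)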
Let me know if anyone has an idea of how to make this work.