I was creating a custom module and got the error in the title when trying to back-propagate after moving the module instance to cuda().
A simpler version looks like this:
import torch
import torch.nn.functional as F
from torch.nn import Module, Parameter

class LinearT(Module):
    def __init__(self, in_features, out_features, bias=True, transpose_flag=False):
        super(LinearT, self).__init__()
        self.in_features = in_features
        self.out_features = out_features
        self.transpose_flag = transpose_flag
        # (my real module also stores extra state here, e.g. a weight_manifold)
        self._weight = Parameter(torch.Tensor(out_features, in_features))
        # register 'weight' as a buffer so _apply()/cuda() moves it,
        # without exposing it to the optimizer as a parameter
        if self.transpose_flag:
            self.register_buffer('weight', self._weight.transpose(-2, -1))
        else:
            self.register_buffer('weight', self._weight)
        if bias:
            self.bias = Parameter(torch.Tensor(out_features))
        else:
            self.register_parameter('bias', None)
        self.reset_parameters()

    def reset_parameters(self):
        # placeholder init so the simplified example runs; my real init differs
        torch.nn.init.xavier_uniform_(self._weight)
        if self.bias is not None:
            torch.nn.init.zeros_(self.bias)

    def forward(self, input):
        return F.linear(input, self.weight, self.bias)
The error encountered is:
RuntimeError: Function torch::autograd::CopyBackwards returned an invalid gradient at index 1 - expected type torch.cuda.FloatTensor but got torch.FloatTensor
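For reference, this is roughly how I trigger it (a minimal sketch; the sizes and input are made up):

# minimal sketch of how the error shows up; sizes and input are made up
model = LinearT(3, 3, transpose_flag=True).cuda()  # moving to GPU is what breaks it
x = torch.randn(2, 3, device='cuda')
loss = model(x).sum()      # forward pass works fine
loss.backward()            # RuntimeError from CopyBackwards is raised here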
I’m creating the ._weight parameter from some other function call, and it is more complex than the example above, but this is close enough.
I used register_buffer on 'weight' to make sure it gets transferred to cuda by the _apply function, and did not make it a parameter, since the optimizer should only be optimizing _weight.
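To spell out the intent, this is what I expect the module to expose (a sketch with made-up sizes):

# what I expect: only _weight and bias are registered as parameters,
# so the optimizer never touches the derived 'weight' buffer
model = LinearT(3, 3, transpose_flag=True)
print([name for name, _ in model.named_parameters()])  # ['_weight', 'bias']
print([name for name, _ in model.named_buffers()])     # ['weight']
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)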
Let me know if anyone has an idea of how to make this work.