Training RNN on GPU

Hi

I’m training an RNN with an LSTM cell. It works when I run the model on the CPU, but it fails when I load the model and the data onto the GPU like this:

import torch
from torch.autograd import Variable

seq = Sequence()
seq.cuda()
# data is a float64 numpy array here, so the input becomes a cuda.DoubleTensor
input = Variable(torch.from_numpy(data).cuda())
out = seq(input)

I then get the following error:

TypeError: addmm_ received an invalid combination of arguments - got (int, int, torch.FloatTensor,
torch.cuda.FloatTensor), but expected one of:

  • (torch.FloatTensor mat1, torch.FloatTensor mat2)
  • (torch.SparseFloatTensor mat1, torch.FloatTensor mat2)
  • (float beta, torch.FloatTensor mat1, torch.FloatTensor mat2)
  • (float alpha, torch.FloatTensor mat1, torch.FloatTensor mat2)
  • (float beta, torch.SparseFloatTensor mat1, torch.FloatTensor mat2)
  • (float alpha, torch.SparseFloatTensor mat1, torch.FloatTensor mat2)
  • (float beta, float alpha, torch.FloatTensor mat1, torch.FloatTensor mat2)
    didn’t match because some of the arguments have invalid types: (int, int, torch.FloatTensor, !torch.cuda.FloatTensor!)
  • (float beta, float alpha, torch.SparseFloatTensor mat1, torch.FloatTensor mat2)
    didn’t match because some of the arguments have invalid types: (int, int, !torch.FloatTensor!, !torch.cuda.FloatTensor!)

It seems like the input is not on the GPU, but when I print the variable, its datatype is

[torch.cuda.DoubleTensor of size 7x6 (GPU 0)]

So how can I fix the problem?

Thank you.

If you implement your own nn.Module that holds some parameters inside, you should declare them as Parameters so that nn.Module.cuda() transfers them into GPU memory.

Can you give an example? Thank you.

Hi, there is not enough information about Sequence(), but I think the hidden weight is a FloatTensor rather than a cuda.FloatTensor. That might be the cause of the problem.

Here’s the source code for nn.Module.cuda():
http://pytorch.org/docs/_modules/torch/nn/modules/module.html#Module.cuda

In short, it iterates over all the sub-modules and nn.Parameters and calls the .cuda() method on them recursively.
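
Roughly, the linked implementation boils down to this (paraphrased from the source above):

def cuda(self, device_id=None):
    # _apply recurses into every child module and converts each
    # nn.Parameter (and registered buffer) it finds
    return self._apply(lambda t: t.cuda(device_id))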

To make .cuda() work correctly, you should declare the variable in your Module like this:

self.weight = Parameter(torch.Tensor(out_features, in_features))

Ref:
http://pytorch.org/docs/nn.html?highlight=linear#parameters
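
For instance, here is a minimal sketch of a hand-rolled module (the module and names are illustrative, not from the docs):

import torch
from torch import nn
from torch.nn import Parameter

class MyLinear(nn.Module):
    def __init__(self, in_features, out_features):
        super(MyLinear, self).__init__()
        # Declared as a Parameter, so Module.cuda() finds it and
        # moves it to GPU memory along with any sub-modules.
        self.weight = Parameter(torch.randn(out_features, in_features))

    def forward(self, input):
        return torch.mm(input, self.weight.t())

model = MyLinear(6, 4)
model.cuda()  # model.weight now lives on the GPU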

Thanks.

The problem is that the Sequence() module includes some variables that are not loaded onto the GPU: the cell states and hidden states of the LSTM. I tried to declare these variables in the Module, but it seems a Variable cannot be registered as a Parameter. So what I’ve done is call the .cuda() method on those variables in forward(). Is there a more efficient solution?
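
Concretely, what I have now looks roughly like this (the Sequence internals are simplified, not my exact code):

import torch
from torch import nn
from torch.autograd import Variable

class Sequence(nn.Module):
    def __init__(self, input_size=1, hidden_size=51):
        super(Sequence, self).__init__()
        self.hidden_size = hidden_size
        self.lstm = nn.LSTMCell(input_size, hidden_size)
        self.linear = nn.Linear(hidden_size, 1)

    def forward(self, input):
        # h and c are created fresh on every call, so they start out as
        # CPU tensors; moving them with .cuda() here is the workaround.
        h = Variable(torch.zeros(input.size(0), self.hidden_size)).cuda()
        c = Variable(torch.zeros(input.size(0), self.hidden_size)).cuda()
        outputs = []
        for x_t in input.split(1, dim=1):
            h, c = self.lstm(x_t, (h, c))
            outputs.append(self.linear(h))
        return torch.cat(outputs, dim=1)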


@zed no, that’s the best you can do. Only nn.Parameter and registered buffers are typecast:
http://pytorch.org/docs/nn.html?highlight=register#torch.nn.Module.register_buffer
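
For example, the initial states could be registered as buffers instead (a sketch; the module and names are made up):

import torch
from torch import nn

class Stateful(nn.Module):
    def __init__(self, hidden_size=51):
        super(Stateful, self).__init__()
        # A buffer is not a learnable parameter, but Module.cuda()
        # (and .float(), .double(), ...) will still move and cast it.
        self.register_buffer('h0', torch.zeros(1, hidden_size))

m = Stateful()
m.cuda()  # m.h0 is now a torch.cuda.FloatTensor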