Error with LSTM when switching from 0.4 to 1.0 (invalid combination of arguments)


(Alexis W) #1

My code was working with PyTorch 0.4, but after switching to 1.0 the following error occurs. Is there a quick fix?

TypeError: lstm() received an invalid combination of arguments - got (Tensor, Tensor, tuple, list, bool, int, float, bool, int), but expected one of:

  • (Tensor data, Tensor batch_sizes, tuple of Tensors hx, tuple of Tensors params, bool has_biases, int num_layers, float dropout, bool train, bool bidirectional)
    didn’t match because some of the arguments have invalid types: (Tensor, Tensor, tuple, list, bool, int, float, bool, int)
  • (Tensor input, tuple of Tensors hx, tuple of Tensors params, bool has_biases, int num_layers, float dropout, bool train, bool bidirectional, bool batch_first)
    didn’t match because some of the arguments have invalid types: (Tensor, Tensor, tuple, list, bool, int, float, bool, int)

(Alexis W) #2

Still need some insight on this :grinning:


(Alessio Rosatelli) #3

I got the same problem today. Is there any quick fix?


#4

It seems you are passing a list for the 4th argument (tuple expected) and an int for the last argument (bool expected).
Could you check these and let me know whether that fixes it?
If not, could you post a (small) executable code snippet so that we can debug it more easily?


(Petteri Nevavuori) #5

Hi @AlexisW!

I faced a similar problem. In my case I had to go through my self-implemented initialization function, which essentially assigns random values to several hyperparameters. The key culprit was the value types: I had to convert every NumPy value to a native Python type with an explicit cast.

A concrete example would be changing

bidirectional = numpy.random.randint(0, 2)  # upper bound is exclusive, so this yields 0 or 1

to

bidirectional = bool(numpy.random.randint(0, 2))
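
If several hyperparameters are generated this way, the casts can be collected in one place. The following is only a minimal sketch: `to_native`, `coerce_lstm_kwargs`, and the `LSTM_ARG_TYPES` table are hypothetical helpers, not part of PyTorch, and the table assumes the usual `nn.LSTM` keyword names.

```python
def to_native(value):
    """Unwrap NumPy-style scalars (anything exposing .item()) to plain Python values."""
    return value.item() if hasattr(value, "item") else value


# Hypothetical table: nn.LSTM keyword arguments and the Python types the
# error message asks for. Extend it to match your own constructor call.
LSTM_ARG_TYPES = {
    "num_layers": int,
    "bias": bool,
    "batch_first": bool,
    "dropout": float,
    "bidirectional": bool,
}


def coerce_lstm_kwargs(hparams):
    """Return a copy of hparams with the known LSTM arguments cast explicitly."""
    coerced = {}
    for name, value in hparams.items():
        value = to_native(value)
        cast = LSTM_ARG_TYPES.get(name)
        # Unknown keys are passed through untouched.
        coerced[name] = cast(value) if cast is not None else value
    return coerced
```

The result could then be unpacked into the constructor, for example `nn.LSTM(input_size, hidden_size, **coerce_lstm_kwargs(hparams))`, so that every flag arrives as a plain Python `bool`.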

While this might not directly remedy your situation, I highly recommend pasting the error message and lining up the expected parameters with the provided ones. I can already say that the first set of expected types has to do with PackedSequence inputs, so I suspect the latter is the one you should be looking at.

Actually, let me do that for you right now:

(Tensor input,  tuple of Tensors hx,    tuple of Tensors params, 
(Tensor,        Tensor,                 tuple,                  

bool has_biases,    int num_layers, float dropout,  
list,               bool,           int,            

bool train, bool bidirectional, bool batch_first)
float,      bool,               int)

It's much easier this way to spot where the expected and provided parameter types differ. Native types should be cast to the corresponding types on the fly, but that does not seem to work for has_biases. I'd suggest you check how you're initializing your model at the moment.
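
The same line-up can be done mechanically. This is a rough sketch that only compares the type names as they are printed in the error message; `parse_types` and `mismatches` are made-up helper names, and the strings are copied from the first post.

```python
def parse_types(signature):
    """Split "(Tensor input, tuple of Tensors hx)" into (base type, name) pairs."""
    pairs = []
    for part in signature.strip("() ").split(","):
        words = part.split()
        # First word is the base type, last word is the parameter name.
        pairs.append((words[0], words[-1]))
    return pairs


def mismatches(expected_signature, got_types):
    """Return (position, name, expected type, received type) for each differing slot."""
    expected = parse_types(expected_signature)
    got = [t.strip() for t in got_types.strip("() ").split(",")]
    return [
        (i + 1, name, exp_type, got_type)
        for i, ((exp_type, name), got_type) in enumerate(zip(expected, got))
        if exp_type != got_type
    ]


GOT = "(Tensor, Tensor, tuple, list, bool, int, float, bool, int)"
PACKED = ("(Tensor data, Tensor batch_sizes, tuple of Tensors hx, "
          "tuple of Tensors params, bool has_biases, int num_layers, "
          "float dropout, bool train, bool bidirectional)")
PADDED = ("(Tensor input, tuple of Tensors hx, tuple of Tensors params, "
          "bool has_biases, int num_layers, float dropout, bool train, "
          "bool bidirectional, bool batch_first)")

for label, signature in (("packed", PACKED), ("padded", PADDED)):
    print(label, mismatches(signature, GOT))
```

For the message in post #1, the first (PackedSequence) signature differs only at positions 4 and 9, while the second differs at six positions.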

Cheers,
PN


(Bill Collins) #6

I’m getting this error too. I would love to see a solution.

Code snippet (I’m wrapping the model in nn.DataParallel):

import torch
import torch.nn as nn


class SimpleLSTM(nn.Module):
    def __init__(self, vocab_size, emb_dim, pad_idx, hid_dim, lstm_layers):
        super(SimpleLSTM, self).__init__()
        self.hid_dim = hid_dim  # 128
        self.lstm_layers = lstm_layers  # 2
        self.embeddings = nn.Embedding(vocab_size, emb_dim, padding_idx=pad_idx)
        self.lstm = nn.LSTM(emb_dim, hid_dim, lstm_layers)

    def forward(self, inputs, targets, input_mask, target_mask):
        # batch size is dynamic
        batch_size = inputs.size()[1]  # inputs = seq_len x batch
        emb_inputs = self.embeddings(inputs)  # seq_len x batch x emb_dim
        # fresh zero-initialized hidden and cell states for every forward pass
        lstm_hid_state = (torch.zeros(self.lstm_layers, batch_size, self.hid_dim),
                          torch.zeros(self.lstm_layers, batch_size, self.hid_dim))
        lstm_out, lstm_hid_state = self.lstm(emb_inputs, lstm_hid_state)
        return lstm_out[-1]  # output at the last time step

Error msg (truncated):

in forward(self, inputs, targets, input_mask, target_mask)
     45         lstm_hid_state = (torch.zeros(self.lstm_layers, batch_size, self.hid_dim),
     46                           torch.zeros(self.lstm_layers, batch_size, self.hid_dim))
---> 47         lstm_out, lstm_hid_state = self.lstm(emb_inputs, lstm_hid_state)
     48         print('last lstm out', lstm_out[-1].size())
     49         #lstm_out, recovered_lengths = nn.utils.rnn.pad_packed_sequence(lstm_out)

~/anaconda3/lib/python3.6/site-packages/torch/nn/modules/module.py in __call__(self, *input, **kwargs)
    487             result = self._slow_forward(*input, **kwargs)
    488         else:
--> 489             result = self.forward(*input, **kwargs)
    490         for hook in self._forward_hooks.values():
    491             hook_result = hook(self, input, result)

~/anaconda3/lib/python3.6/site-packages/torch/nn/modules/rnn.py in forward(self, input, hx)
    177         if batch_sizes is None:
    178             result = _impl(input, hx, self._flat_weights, self.bias, self.num_layers,
--> 179                            self.dropout, self.training, self.bidirectional, self.batch_first)
    180         else:
    181             result = _impl(input, batch_sizes, hx, self._flat_weights, self.bias,

TypeError: lstm() received an invalid combination of arguments - got (Tensor, tuple, list, float, int, int, bool, bool, bool), but expected one of:

  • (Tensor data, Tensor batch_sizes, tuple of Tensors hx, tuple of Tensors params, bool has_biases, int num_layers, float dropout, bool train, bool bidirectional)
    didn’t match because some of the arguments have invalid types: (Tensor, tuple, list, float, int, int, bool, bool, bool)
  • (Tensor input, tuple of Tensors hx, tuple of Tensors params, bool has_biases, int num_layers, float dropout, bool train, bool bidirectional, bool batch_first)
    didn’t match because some of the arguments have invalid types: (Tensor, tuple, list, float, int, int, bool, bool, bool)

Thanks!
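
One way to narrow this down might be to check the attributes that rnn.py passes to `_impl` in the traceback above. The sketch below is only a debugging aid: `unexpected_attr_types` is a hypothetical helper, the expected types are taken from the second signature in the error message, and the comparison is deliberately strict, so it may flag values that PyTorch itself would accept.

```python
# Attribute names as they appear in the rnn.py frame of the traceback,
# paired with the types the lstm() error message lists for them.
EXPECTED_ATTR_TYPES = {
    "bias": bool,
    "num_layers": int,
    "dropout": float,
    "training": bool,
    "bidirectional": bool,
    "batch_first": bool,
}


def unexpected_attr_types(module):
    """Return {attribute: actual type name} for attributes whose type differs."""
    report = {}
    for name, expected in EXPECTED_ATTR_TYPES.items():
        value = getattr(module, name)
        # Strict identity check: an int is reported even where a float is expected.
        if type(value) is not expected:
            report[name] = type(value).__name__
    return report
```

It would be called as `unexpected_attr_types(model.lstm)`, or `unexpected_attr_types(model.module.lstm)` when the model is wrapped in `nn.DataParallel` (`model` is a placeholder name here). Reading the received types in the error above against that signature, `bias` appears to arrive as a float and `dropout` as an int.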


(Alexis W) #7

This is the correct solution. Issues about this were submitted to PyTorch a while ago, but the developers do not seem to have had a chance to get to them.