Hi. I’m trying to understand something…
In the word_language_model example, the network is trained on “data” sequences of length args.bptt, which is 20 words by default (in batches that are also 20 by default):
output, hidden = model(data, hidden)
Then in generate.py, you load the same model from the checkpoint file, but the starting “input” is only one word long:
input = Variable(torch.rand(1, 1).mul(ntokens).long(), volatile=True)
and then you predict a new word via…
output, hidden = model(input, hidden)
How is this possible? If the model is expecting 20 inputs, shouldn’t it produce an error when you try to send it only 1?
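As a sanity check on the sequence-length question, here is a toy sketch I put together (using the plain-tensor API from a newer PyTorch rather than the example’s Variable-era code, and made-up sizes). It seems to show that an RNN module accepts any sequence length, since its weights depend only on the input and hidden sizes:

```python
import torch
import torch.nn as nn

# Toy LSTM: input features 8, hidden size 8, 1 layer.
# Its weights depend only on these sizes, not on sequence length.
rnn = nn.LSTM(8, 8, 1)
h0 = (torch.zeros(1, 1, 8), torch.zeros(1, 1, 8))

# A 20-step sequence and a 1-step sequence both go through fine:
out20, _ = rnn(torch.randn(20, 1, 8), h0)
out1, _ = rnn(torch.randn(1, 1, 8), h0)
print(out20.shape)  # torch.Size([20, 1, 8])
print(out1.shape)   # torch.Size([1, 1, 8])
```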
Furthermore, when I actually try to send the generation code a sequence of length 20 by creating…
input = corpus.test[0:20]
print("input =",input)
Then I get…
('input = ',
142
78
54
251
2360
405
24
315
706
32
101
934
935
936
874
251
572
5564
2680
34
[torch.LongTensor of size 20]
)
Traceback (most recent call last):
  File "generate.py", line 85, in <module>
    output, hidden = model(input, hidden)
  File "/usr/local/lib/python2.7/dist-packages/torch/nn/modules/module.py", line 210, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/mcskwayrd/neural/torch/pytorch/examples/word_language_model/model.py", line 27, in forward
    emb = self.encoder(input)
  File "/usr/local/lib/python2.7/dist-packages/torch/nn/modules/module.py", line 210, in __call__
    result = self.forward(*input, **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/torch/nn/modules/sparse.py", line 94, in forward
    )(input, self.weight)
RuntimeError: expected a Variable argument, but got LongTensor
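If I read this error right, the embedding layer in this PyTorch version wants a Variable, not a raw LongTensor. A minimal sketch of what I suspect the fix looks like (toy embedding with made-up sizes; in newer PyTorch, Variable is effectively just a tensor):

```python
import torch
import torch.nn as nn
from torch.autograd import Variable

emb = nn.Embedding(10, 4)           # toy: 10-word vocab, 4-dim embeddings
idx = torch.LongTensor([1, 2, 3])   # raw index tensor, like corpus.test[0:20]

# Wrapping the indices in a Variable is what the error seems to ask for:
out = emb(Variable(idx))
print(out.size())  # torch.Size([3, 4])
```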
And if instead I use the get_batch() method, as it was used in main.py…
corpus = data.Corpus(args.data)
ntokens = len(corpus.dictionary)
hidden = model.init_hidden(1)

def batchify(data, bsz):  # breaks the corpus into bsz parallel streams
    nbatch = data.size(0) // bsz
    data = data.narrow(0, 0, nbatch * bsz)
    data = data.view(bsz, -1).t().contiguous()
    if args.cuda:
        data = data.cuda()
    return data

eval_batch_size = 10
test_data = batchify(corpus.test, eval_batch_size)

def get_batch(source, i, evaluation=False):
    bptt = 20
    seq_len = min(bptt, len(source) - 1 - i)
    data = Variable(source[i:i+seq_len], volatile=evaluation)
    target = Variable(source[i+1:i+1+seq_len].view(-1))
    return data, target

#input = Variable(torch.rand(1, 1).mul(ntokens).long(), volatile=True)
input, targets = get_batch(test_data, 0, evaluation=True)
Then, when I get to the prediction step (i.e., output, hidden = model(input, hidden)), I get the error…
Traceback (most recent call last):
  File "generate.py", line 96, in <module>
    output, hidden = model(input, hidden)
  File "/usr/local/lib/python2.7/dist-packages/torch/nn/modules/module.py", line 210, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/mcskwayrd/neural/torch/pytorch/examples/word_language_model/model.py", line 28, in forward
    output, hidden = self.rnn(emb, hidden)
  File "/usr/local/lib/python2.7/dist-packages/torch/nn/modules/module.py", line 210, in __call__
    result = self.forward(*input, **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/torch/nn/modules/rnn.py", line 79, in forward
    return func(input, self.all_weights, hx)
  File "/usr/local/lib/python2.7/dist-packages/torch/nn/_functions/rnn.py", line 228, in forward
    return func(input, *fargs, **fkwargs)
  File "/usr/local/lib/python2.7/dist-packages/torch/nn/_functions/rnn.py", line 138, in forward
    nexth, output = func(input, hidden, weight)
  File "/usr/local/lib/python2.7/dist-packages/torch/nn/_functions/rnn.py", line 67, in forward
    hy, output = inner(input, hidden[l], weight[l])
  File "/usr/local/lib/python2.7/dist-packages/torch/nn/_functions/rnn.py", line 96, in forward
    hidden = inner(input[i], hidden, *weight)
  File "/usr/local/lib/python2.7/dist-packages/torch/nn/_functions/rnn.py", line 22, in LSTMCell
    gates = F.linear(input, w_ih, b_ih) + F.linear(hx, w_hh, b_hh)
  File "/usr/local/lib/python2.7/dist-packages/torch/autograd/variable.py", line 748, in __add__
    return self.add(other)
  File "/usr/local/lib/python2.7/dist-packages/torch/autograd/variable.py", line 288, in add
    return self._add(other, False)
  File "/usr/local/lib/python2.7/dist-packages/torch/autograd/variable.py", line 282, in _add
    return Add(inplace)(self, other)
  File "/usr/local/lib/python2.7/dist-packages/torch/autograd/_functions/basic_ops.py", line 13, in forward
    return a.add(b)
RuntimeError: inconsistent tensor size at /home/soumith/local/builder/wheel/pytorch-src/torch/lib/TH/generic/THTensorMath.c:601
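My guess (which could be wrong) is that this second failure is a batch-size mismatch: I called model.init_hidden(1), but batchify(corpus.test, 10) makes the input batch 10 wide. A toy reproduction with made-up sizes seems to trigger the same kind of size error:

```python
import torch
import torch.nn as nn

rnn = nn.LSTM(8, 8, 1)
x = torch.randn(20, 10, 8)  # seq_len 20, batch 10

# Hidden state initialized for batch size 1 -- like init_hidden(1):
h_bad = (torch.zeros(1, 1, 8), torch.zeros(1, 1, 8))
try:
    rnn(x, h_bad)
except RuntimeError as e:
    print("size mismatch:", e)

# Hidden state matching the input's batch size goes through:
h_ok = (torch.zeros(1, 10, 8), torch.zeros(1, 10, 8))
out, _ = rnn(x, h_ok)
print(out.shape)  # torch.Size([20, 10, 8])
```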
I’m confused: how does sending 20-word sequences work in main.py but fail in generate.py?
PS: I see the documentation for torch.nn.RNN says the input is supposed to be a tensor, which is exactly what I’m sending. It doesn’t say anything about needing a Variable:
input (seq_len, batch, input_size): tensor containing the features of the input sequence.
Thanks!