Hello everyone, I’m new to PyTorch and am having some issues with a simple LSTM.
What I am trying to do is create an LSTM that will accept a 2-element tensor during inference, update its internal state, and return a single number (or a 1-element tensor). During inference, I will never be providing sequences, only data one element at a time. I assume this is a fairly common use case.
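The one-element-at-a-time inference described above could look roughly like the following. This is only a sketch under stated assumptions, not the code from this post: it uses nn.LSTMCell (which processes a single time step) rather than nn.LSTM, a hypothetical linear output layer called head, a hypothetical helper called step, untrained random weights, and a recent PyTorch release on Python 3 where plain tensors are passed directly instead of autograd.Variable.

```python
import torch
from torch import nn

torch.manual_seed(0)

# Assumed sizes: 2 input features, 10 hidden units (mirrors the training code in this post)
cell = nn.LSTMCell(input_size=2, hidden_size=10)
head = nn.Linear(10, 1)  # hypothetical layer mapping the hidden state to one number

# Internal state carried between calls, each of shape (batch, hidden_size)
h = torch.zeros(1, 10)
c = torch.zeros(1, 10)


def step(element, h, c):
    """Consume one 2-element input and return (output, new_h, new_c)."""
    with torch.no_grad():  # inference only, no gradients needed
        h, c = cell(element.view(1, 2), (h, c))
        return head(h).view(1), h, c


outputs = []
for _ in range(5):
    element = torch.randn(2)  # stand-in for one 2-element input
    out, h, c = step(element, h, c)
    outputs.append(out)
```

The caller keeps h and c between calls, so each new element updates the state left by the previous one and no sequence ever has to be assembled at inference time.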
However, I am using sequences for training. I am providing an x tensor of the format (20, 1, 2), where 20 is the sequence length, 1 is the batch size, and 2 is the 2-element input tensor. y is similarly (20, 1, 1) because the output is 1 single element. What I expect to happen is that it will train one element at a time.
Is this a correct assumption? The other possibility is to train with an entire sequence as input and the single next element as output, but that does not really work here, since the inputs and outputs are completely different kinds of data. My use case is not like language translation, in which many inputs are translated into many different outputs, nor is it like predicting the next single letter in a sentence. Rather, I am using sequences of line equations (two elements, m and b, of the form y=mx+b) as x, and the corresponding sequences of steering angles (for a self-driving car) as y. I may be doing this the wrong way; please correct me, as I am new to PyTorch and LSTMs (but not to machine learning).
I have attempted a simple implementation of the above idea in Python 2, shown below. (The data loading code is different in my real code, but the essence is the same.)
import torch
from torch import autograd, nn, optim
# Create random training data
x = autograd.Variable(torch.randn(20, 1, 2))
y = autograd.Variable(torch.randn(20, 1, 1))
# Create a model and use the mean squared error loss function with the Adadelta optimizer
model = nn.Sequential(
    nn.LSTM(input_size=2, hidden_size=10, num_layers=1),
    nn.ReLU(),
    nn.Linear(10, 4),
    nn.ReLU(),
    nn.Linear(4, 1)
)
loss_function = nn.MSELoss()
optimizer = optim.Adadelta(model.parameters())
# Train the network one epoch at a time
for epoch in range(100):
    # Compute the predictions by passing the entire training sequence to the network
    predictions = model(x)
    # Compute the loss using the predictions
    loss = loss_function(predictions, y)
    # Zero the gradients for the variables that will be updated
    optimizer.zero_grad()
    # Run backpropagation, calculating gradients for each of the trainable parameters
    loss.backward()
    # Update the parameters using the optimizer
    optimizer.step()
In any case, when I attempt to run the above code, I get a completely undocumented error message:
Traceback (most recent call last):
  File "example_script.py", line 24, in <module>
    predictions = model(x)
  File "/usr/local/lib/python2.7/site-packages/torch/nn/modules/module.py", line 325, in __call__
    result = self.forward(*input, **kwargs)
  File "/usr/local/lib/python2.7/site-packages/torch/nn/modules/container.py", line 67, in forward
    input = module(input)
  File "/usr/local/lib/python2.7/site-packages/torch/nn/modules/module.py", line 325, in __call__
    result = self.forward(*input, **kwargs)
  File "/usr/local/lib/python2.7/site-packages/torch/nn/modules/activation.py", line 43, in forward
    return F.threshold(input, self.threshold, self.value, self.inplace)
RuntimeError: threshold(): argument 'input' (position 1) must be Variable, not tuple
Hopefully I am doing something stupid that can be easily corrected. Thank you in advance for your help. I would be glad to provide more information or clarify if necessary!
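For reference, the last line of the traceback says that nn.ReLU received a tuple. nn.LSTM returns a tuple of the form (output, (h_n, c_n)), and nn.Sequential passes that whole tuple on to the next layer. One way around this, shown here only as a minimal sketch, is to wrap the LSTM in a small module that unpacks the tuple before the remaining layers. The class name SteeringLSTM and the attribute name head are hypothetical, and the sketch assumes Python 3 with a recent PyTorch release where tensors are used directly instead of autograd.Variable.

```python
import torch
from torch import nn, optim

torch.manual_seed(0)


class SteeringLSTM(nn.Module):  # hypothetical name, for illustration only
    def __init__(self, input_size=2, hidden_size=10):
        super().__init__()
        self.lstm = nn.LSTM(input_size=input_size, hidden_size=hidden_size, num_layers=1)
        # Same layers that followed the LSTM in the nn.Sequential version
        self.head = nn.Sequential(
            nn.ReLU(),
            nn.Linear(hidden_size, 4),
            nn.ReLU(),
            nn.Linear(4, 1),
        )

    def forward(self, x, state=None):
        # nn.LSTM returns (output, (h_n, c_n)); only output goes to the head
        output, state = self.lstm(x, state)
        return self.head(output), state


# Random training data with shapes (seq_len, batch, features)
x = torch.randn(20, 1, 2)
y = torch.randn(20, 1, 1)

model = SteeringLSTM()
loss_function = nn.MSELoss()
optimizer = optim.Adadelta(model.parameters())

losses = []
for epoch in range(100):
    predictions, _ = model(x)  # one prediction per time step
    loss = loss_function(predictions, y)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    losses.append(loss.item())
```

Because forward also returns the LSTM state, the same module can be called with a length-1 sequence and the previous state at inference time, which matches the one-element-at-a-time use described at the top of this post.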