I am trying to create an LSTM model to predict some time series data.

My data is configured as follows:

Input instances:

`[0.99, 0.98, 1.01, 1.03, 1.001, 0.98, 1.001]`

Target Value:

`[0.995]`

So each instance is a vector of seven values used to predict the target.
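
For concreteness, the `train_data` dataset used further down can be thought of as something like the following (a minimal sketch, not my actual preprocessing; the `TensorDataset` wrapping and the `(N, 7)` / `(N, 1)` shapes are just how I am describing the layout):

```
import torch
from torch.utils.data import TensorDataset

# One row per instance: (N, 7) inputs and (N, 1) targets, both double precision.
instances = torch.tensor([[0.99, 0.98, 1.01, 1.03, 1.001, 0.98, 1.001]], dtype=torch.double)
targets = torch.tensor([[0.995]], dtype=torch.double)

train_data = TensorDataset(instances, targets)
```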

I am trying to keep my model simple to start, so here are the components:

```
import torch
import torch.nn as nn


class MyModel(nn.Module):
    def __init__(self):
        super(MyModel, self).__init__()
        self.lstm1 = nn.LSTM(input_size=7, hidden_size=51, num_layers=1)
        self.linear = nn.Linear(in_features=51, out_features=1)

    def forward(self, input, future=0):
        outputs = []
        h_t = torch.rand(input.size(1) * 1, 1, 51, dtype=torch.double)
        c_t = torch.rand(input.size(1) * 1, 1, 51, dtype=torch.double)
        for i, input_t in enumerate(input.chunk(input.size(1), dim=1)):
            h_t, c_t = self.lstm1(input_t, (h_t, c_t))
            output = self.linear(h_t)
            output = output.add(1e-8)
            outputs += [output]
        for i in range(future):
            h_t, c_t = self.lstm1(output, (h_t, c_t))
            output = self.linear(h_t)
            outputs += [output]
        outputs = torch.stack(outputs, 1).squeeze(2)
        return outputs
```

This is called from the following:

```
...
traindataloader = DataLoader(train_data,
                             batch_size=500,
                             shuffle=True,
                             num_workers=4)

model = MyModel().double()
criterion = nn.MSELoss(reduction='sum')  # I have also tried 'none' and 'elementwise_mean'
optimizer = optim.Adam(model.parameters(), lr=1e-06, weight_decay=0.1)

for i in range(10):
    optimizer.zero_grad()
    print(f"{Fore.BLUE}STEP: ", i, f"{Style.RESET_ALL}")
    inputs = None
    targets = None
    for idx, data in enumerate(traindataloader):
        def closure():
            y_pred = model(inputs)
            loss = criterion(y_pred, targets)
            print('Loss:', loss.item())
            loss.backward()
            return loss
        inputs, targets = data
        optimizer.step(closure)
```
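
For reference, one mini-batch from this loader looks like the following (the shapes assume `train_data` is built along the lines of the sketch above):

```
# Inspect a single mini-batch from the DataLoader.
inputs, targets = next(iter(traindataloader))
print(inputs.shape, targets.shape)  # e.g. torch.Size([500, 7]) torch.Size([500, 1])
print(inputs.dtype)                 # torch.float64, matching MyModel().double()
```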

Using this code, I can get through one mini-batch of inputs before the loss goes to `inf` and then to `nan`.

My questions are:

- Is my data structured correctly for an LSTM?
- Is my model correct / appropriate?
- Is the DataLoader used correctly? (It seems to be)

From my research, it appears I have an exploding gradient, and I have tried several ways to resolve this, but I cannot find the right combination.
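
For example, I am not sure whether clipping the gradient norm inside the closure is the right way to address it. A sketch of what I mean (the max-norm value of 1.0 is just a placeholder):

```
# Same closure as above, with the gradient norm clipped before the optimizer applies the step.
def closure():
    y_pred = model(inputs)
    loss = criterion(y_pred, targets)
    print('Loss:', loss.item())
    loss.backward()
    # Rescale gradients so their global L2 norm is at most 1.0.
    torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm=1.0)
    return loss
```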

I am running this on PyTorch 0.4.1 / Python 3.6.

Any help would be greatly appreciated.