My model's forward pass was:
def forward(self, x, hidden=None):
    lstm_out, hidden = self.lstm(x, hidden)
    # sum the forward and backward directions of the bidirectional LSTM
    lstm_out = (lstm_out[:, :, :self.hidden_size] +
                lstm_out[:, :, self.hidden_size:])
    out = torch.nn.SELU()(lstm_out)
    return out, hidden
It now is:
def forward(self, x, hidden=None):
    lstm_out, hidden = self.lstm(x, hidden)  # (batch, seq, 2 * hidden_size)
    batch_size = lstm_out.size(0)
    flattened_out = lstm_out.view(-1, self.hidden_size * 2)
    # summed directions (note: this tensor is never used below)
    lstm_out = (lstm_out[:, :, :self.hidden_size] +
                lstm_out[:, :, self.hidden_size:])
    out = self.linear(flattened_out)
    out = torch.nn.functional.relu(out)
    view_out = out.view(batch_size, self.seq_length, -1)
    return view_out, hidden
I used to get a validation loss (with MSELoss) under 1000
after 2-3 epochs. Now, with the Linear layer, it skyrockets to 15000
even after 10 epochs. Why would this be?