I want to train a model for a time-series prediction task. I built my own model in PyTorch, but I'm getting much worse performance than the same model implemented in Keras: each epoch takes 50 ms in PyTorch versus 1 ms in Keras. I'd like to share my (fairly simple) code to find out whether I made a mistake somewhere or whether this is just how PyTorch behaves. Thanks in advance.

This is my module:

```python
class Net(nn.Module):
    def __init__(self, input_dim, hidden_dim, output_dim, hidden_layers):
        super(Net, self).__init__()
        self.input_dim = input_dim
        self.hidden_dim = hidden_dim
        self.output_dim = output_dim
        self.hidden_layers = hidden_layers
        self.lstm = nn.LSTM(input_dim, hidden_dim, hidden_layers)
        self.h2o = nn.Linear(hidden_dim, output_dim)

    def forward(self, x):
        # Fresh random hidden and cell states on every forward pass
        h_t = Variable(torch.randn(self.hidden_layers, BATCH_SIZE, self.hidden_dim)).cuda()
        c_t = Variable(torch.randn(self.hidden_layers, BATCH_SIZE, self.hidden_dim)).cuda()
        # nn.LSTM returns (output, (h_n, c_n)), so h_t below ends up
        # holding the full output sequence, not the final hidden state
        h_t, c_t = self.lstm(x, (h_t, c_t))
        output = self.h2o(h_t)
        return output
```
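
For reference, here's a minimal shape check of what `nn.LSTM` returns (the dimensions are made up for illustration; `input_size=10` stands in for my `input_dim`): the first element is the per-timestep output sequence, the second is the final `(h_n, c_n)` pair, which is why applying `h2o` to the first return value works.

```python
import torch
import torch.nn as nn

# Toy sizes, chosen only for this shape check
lstm = nn.LSTM(input_size=10, hidden_size=40, num_layers=1)
x = torch.randn(5, 3, 10)        # (seq_len, batch, input_size)

output, (h_n, c_n) = lstm(x)     # output covers every timestep
print(output.shape)              # (5, 3, 40): (seq_len, batch, hidden_size)
print(h_n.shape, c_n.shape)      # (1, 3, 40) each: (num_layers, batch, hidden_size)
```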

And this is the training execution:

```python
model = Net(INPUT_DIM, 40, OUTPUT_DIM, 1).cuda()
loss_fcn = MEDLoss()
optimizer = optim.RMSprop(model.parameters())

for epoch in range(EPOCHS):
    loss = 0
    start = time.time()
    for seq in range(11, 20):
        length = seq_lenghts[seq]
        x = Variable(X_data[:length, [seq], :]).cuda()
        y = Variable(Y_data[:length, [seq], :]).cuda()
        model.zero_grad()
        output = model(x)
        loss = loss_fcn(output, y)
        loss.backward()
        optimizer.step()
    print("Epoch", epoch + 1, "Loss:", loss.cpu().data.numpy(), "Time:", time.time() - start)
```
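
One thing I'm unsure about with my measurement: CUDA kernels launch asynchronously, so `time.time()` right after `optimizer.step()` might not reflect when the GPU actually finished. A sketch of how I could synchronize before reading the clock (the matmul is just stand-in work, not my real model):

```python
import time
import torch

device = "cuda" if torch.cuda.is_available() else "cpu"
x = torch.randn(64, 128, 128, device=device)

start = time.time()
y = x @ x                        # stand-in for the training step's GPU work
if device == "cuda":
    torch.cuda.synchronize()     # wait for all queued kernels to finish
elapsed = time.time() - start
print(f"elapsed: {elapsed:.6f}s")
```

Without the `synchronize()` call, the measured time can exclude GPU work that is still in flight.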