Input Shape for LSTM

I have an LSTM defined in PyTorch as: nn.LSTM(input_size=101, hidden_size=4, batch_first=True)

I then have a deque of length 4 holding a history of states (each a 1D tensor of size 101) from the environment. I reshape this and pass it to my agent:


so that it has shape [1,4,101].
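
For reference, one way to build a tensor of that shape from a deque is sketched below. The names `history` and `obs_size` are placeholders, and random tensors stand in for the real environment states.

```python
from collections import deque

import torch

obs_size = 101  # size of each state vector, as in the question
history = deque(maxlen=4)  # hypothetical name for the state history

# Fill the history with stand-in states (random here; from the environment in practice).
for _ in range(4):
    history.append(torch.randn(obs_size))

# Stack the 4 states along a new time dimension, then add a batch dimension.
x = torch.stack(list(history)).unsqueeze(0)
print(x.shape)  # torch.Size([1, 4, 101])
```

With batch_first=True, the dimensions are read as [batch, sequence length, features], so the last dimension is the one that has to match input_size.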

Essentially, I believe that agents in my environment should be able to predict what happens next more accurately when given more history. I am trying to use an LSTM to achieve this. The problem is that when I pass this through my model:

def forward(self, x):
        logits =

I get the error:
RuntimeError: input.size(-1) must be equal to input_size. Expected 404, got 101

Why is this happening? As I understand it, the LSTM should expect a sequence of observations of size 101 to be passed to it, not a tensor of shape [hidden_size*obs_size]. Otherwise how does it figure out how big a feature is?
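
The message compares the last dimension of the input with the input_size the LSTM was constructed with, so "Expected 404" suggests that the module which actually ran was built with input_size=404 rather than 101. A minimal sketch that reproduces the same kind of error, assuming a hypothetical LSTM built with 4 * 101 = 404 input features:

```python
import torch
from torch import nn

# Hypothetical mismatch: the LSTM is built for 404 features per time step...
lstm = nn.LSTM(input_size=4 * 101, hidden_size=4, batch_first=True)
# ...but each time step of the input only has 101 features.
x = torch.randn(1, 4, 101)

print(lstm.input_size, x.shape[-1])  # 404 101

try:
    lstm(x)
    error_message = None
except RuntimeError as err:
    error_message = str(err)

print(error_message)
```

Printing the module's input_size next to x.shape just before the forward call is a quick way to see which of the two does not match.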

Could you provide a reproducible bit of code that raises that RuntimeError? When I try to reproduce your code, it runs fine for me, so I suspect the input you’re passing it isn’t actually shaped like (1, 4, 101).

import torch
from torch import nn
actor = nn.LSTM(input_size=101, hidden_size=4, batch_first=True)
x = torch.randn(1, 4, 101)
actor(x)

(tensor([[[ 1.7885e-05,  3.4918e-01,  3.8907e-01,  2.3808e-01],
          [ 4.0551e-02,  6.3810e-02,  6.0021e-01,  2.8658e-03],
          [ 1.1655e-01,  2.2330e-01, -2.9728e-02,  1.0425e-02],
          [ 5.7080e-02, -3.9743e-01, -5.7700e-01,  7.0268e-02]]],
 (tensor([[[ 0.0571, -0.3974, -0.5770,  0.0703]]], grad_fn=<StackBackward0>),
  tensor([[[ 0.9932, -0.4577, -1.0284,  0.1895]]], grad_fn=<StackBackward0>)))

@Andrei_Cristea I will try to get a minimal example working, but bear in mind that I get the same error when I run it like this:

class ActorNet(nn.Module):
    def __init__(self, obs_size, n_actions, depth, hidden_size = 64):

        print(obs_size) = nn.Sequential(nn.LSTM(input_size=obs_size, hidden_size=4, batch_first=True))

    def forward(self, x):
        x = torch.randn(1,4,101)
        logits =
        logits = torch.nan_to_num(logits)
        dist = Categorical(logits=logits)
        action = dist.sample()

        return dist, action
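
A minimal runnable sketch of an actor along these lines is given below. It is an assumption about the intended design, not the original code: the class name `ActorSketch`, the attribute names `lstm` and `head`, the linear output layer, the value n_actions=3, and the choice to use only the last time step are all illustrative. Note that nn.LSTM returns a tuple (output, (h_n, c_n)), so the output tensor has to be unpacked before it can be used as logits.

```python
import torch
from torch import nn
from torch.distributions import Categorical


class ActorSketch(nn.Module):
    # Hypothetical actor: names and layer layout are illustrative assumptions.
    def __init__(self, obs_size, n_actions, hidden_size=64):
        super().__init__()
        self.lstm = nn.LSTM(input_size=obs_size, hidden_size=hidden_size, batch_first=True)
        self.head = nn.Linear(hidden_size, n_actions)

    def forward(self, x):
        # x: [batch, seq_len, obs_size]
        output, _ = self.lstm(x)  # output: [batch, seq_len, hidden_size]
        logits = self.head(output[:, -1, :])  # use the last time step only
        logits = torch.nan_to_num(logits)
        dist = Categorical(logits=logits)
        action = dist.sample()
        return dist, action


actor = ActorSketch(obs_size=101, n_actions=3)  # n_actions=3 is an arbitrary choice
dist, action = actor(torch.randn(1, 4, 101))
print(action.shape)  # torch.Size([1])
```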

Not to worry, there was a very silly mistake. You can delete this post now :grinning: