PyTorch: How to get data from an LSTM

I am new to PyTorch and I am trying to build a reinforcement learning system that uses OpenAI to predict whether a stock should be bought and at what time.

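import torch
import torch.nn as nn

# Assumed definition: `device` is used below but was not shown in the snippet.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

class NeuronalNetwork(nn.Module):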
    def __init__(self, stock_env: StockEnv):
        super(NeuronalNetwork, self).__init__()
        self.stock_env = stock_env
        input_size = len(self.stock_env.normalized_dataframe.columns)
        self.hidden_size = 128
        self.num_layers = 4
        self.kernel = 2
        output_size = self.stock_env.action_space.n
        self.lstm = nn.LSTM(input_size=input_size, hidden_size=self.hidden_size, num_layers=self.num_layers, batch_first=True)
        self.output_layer = nn.Linear(self.hidden_size, output_size)
        self.softmax = nn.LogSoftmax(dim=-1)  # dim is an axis index; -1 is the action axis
        self.tanh = nn.Tanh()

    def forward(self, x, hidden=None):
        # x has shape N x T x D
        # N - the number of windows (batch size)
        # T - the window size
        # D - the total number of features (indicators plus OHLCV)

        if len(x.shape) > 2:
            batch_size = x.shape[0]
        else:
            batch_size = 1

        if hidden is None:
            hidden = (
                torch.zeros(self.num_layers, batch_size, self.hidden_size).to(device),
                torch.zeros(self.num_layers, batch_size, self.hidden_size).to(device),
            )

        D = len(self.stock_env.normalized_dataframe.columns)
        T = self.stock_env.window_size
        N = batch_size
        x = x.view(N, T, D).type(torch.FloatTensor).to(device)

        out, (ht, ct) = self.lstm(x, hidden)
        out = self.tanh(out)
        out = self.output_layer(out)
        return out

The x passed to forward holds my data in the form [Number_of_batches x Window_size x Features].

At the moment my out has the shape Number_of_batches x Window_size x Action, but I want my model to learn to predict the best action ONLY for the 250th element. Does anyone know how I can obtain an out with a shape of (batch_size x action), where the action is taken from the last element along the window_size dimension?
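
The selection described above can be sketched as follows. This is only a minimal illustration under stated assumptions: the sizes `N, T, D, H, A` and the names `head`, `last_step` and `scores` are made up for the example and are not taken from the original model or environment. It slices the last time step from the LSTM output before the linear layer, so the result has shape (batch_size x action).

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Hypothetical sizes for illustration only (not from the original environment).
N, T, D, H, A = 3, 250, 6, 16, 4  # batch, window, features, hidden, actions

lstm = nn.LSTM(input_size=D, hidden_size=H, num_layers=2, batch_first=True)
head = nn.Linear(H, A)  # stands in for the question's output_layer

x = torch.randn(N, T, D)
out, (ht, ct) = lstm(x)               # out: (N, T, H)

last_step = out[:, -1, :]             # (N, H): top layer at the final time step
scores = head(torch.tanh(last_step))  # (N, A): one score per action per window

# For a unidirectional LSTM with batch_first=True, the final hidden state of
# the top layer holds the same values as the last time step of `out`.
same_as_hidden = torch.allclose(last_step, ht[-1])
print(scores.shape, same_as_hidden)
```

Slicing before the linear layer avoids computing action scores for the other time steps; applying the linear layer first and then taking `out[:, -1, :]` should give the same values, since the layer acts on each time step independently.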


out.shape => (batch_size, window_size, actions)

FOR b IN all batch_size:
   batch = []
   FOR a IN all actions:
      batch.append(out[b][window_size - 1][a])

In the end, I would have an out of shape batch_size x action, where the action is only the action for the 250th element of the window.
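
For comparison, the loop idea can be written out in Python and checked against a single slice. This is a sketch under assumptions: the tensor `out` here is a made-up stand-in with hypothetical sizes, not the output of the actual network, and the names `looped` and `sliced` are only illustrative.

```python
import torch

# Hypothetical stand-in for the network output: (batch_size, window_size, actions).
batch_size, window_size, n_actions = 2, 250, 3
out = torch.arange(batch_size * window_size * n_actions, dtype=torch.float32)
out = out.view(batch_size, window_size, n_actions)

# Explicit loops: for every batch entry, collect each action's value
# at the last position of the window.
rows = []
for b in range(batch_size):
    batch = []
    for a in range(n_actions):
        batch.append(out[b, window_size - 1, a])
    rows.append(torch.stack(batch))
looped = torch.stack(rows)   # (batch_size, n_actions)

# The same selection as a single slice over the window dimension.
sliced = out[:, -1, :]

print(sliced.shape, torch.equal(looped, sliced))
```

The slice version is usually preferable because it stays a single tensor operation and keeps gradients flowing without any Python-level loops.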

I’m not sure whether my question makes sense. If it doesn’t, just let me know and I will try to explain it differently.