Hello! I am learning about LSTM networks and I am currently training with a many-to-one approach. Here is what my forward function looks like:

```python
def forward(self, x):
    out0 = None
    for i in range(x.shape[1]):
        inp = x[:, i, :]                   # one timestep: [1, features_count]
        out0 = torch.relu(self.inp_fc(inp))
        out0, _ = self.lstm(out0)          # nn.LSTM returns (output, states)
    out = torch.tanh(self.dense(out0))
    return out
```

`x` has shape `[1, samples_count, features_count]`, so I am processing each sample sequentially, one at a time, using a `for` loop. Once the forward function is done, I do backward propagation using a loss function.
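For completeness, here is a minimal runnable sketch of my setup. The class name, layer sizes, and the MSE loss are just placeholders I picked for the example, not my real values:

```python
import torch
import torch.nn as nn

class Many2One(nn.Module):
    # Hypothetical sizes, just for illustration
    def __init__(self, features_count=4, hidden=8):
        super().__init__()
        self.inp_fc = nn.Linear(features_count, hidden)
        self.lstm = nn.LSTM(hidden, hidden)
        self.dense = nn.Linear(hidden, 1)

    def forward(self, x):
        out0 = None
        for i in range(x.shape[1]):
            inp = x[:, i, :]                   # one timestep: [1, features_count]
            out0 = torch.relu(self.inp_fc(inp))
            out0, _ = self.lstm(out0)          # nn.LSTM returns (output, states)
        out = torch.tanh(self.dense(out0))
        return out

model = Many2One()
x = torch.randn(1, 5, 4)        # [1, samples_count, features_count]
target = torch.zeros(1, 1)
loss = nn.functional.mse_loss(model(x), target)
loss.backward()                  # gradients land in each parameter's .grad
print(model.inp_fc.weight.grad is not None)   # prints True
```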

So my question is: does the fact that I do not store `out0` across iterations of the `for` loop impact backward propagation in any way? I might've asked a stupid question, but I am still struggling to understand where exactly gradients are stored.