Hi there,

I am building a reinforcement learning model that uses an RNN. I wrote the RNN like this:

```
class LSTMModel(nn.Module):
    def __init__(self, input_dim, hidden_dim, layer_dim, output_dim, dropout_prob, state_size):
        super(LSTMModel, self).__init__()
        # Define the number of layers and the nodes in each layer
        self.hidden_dim = hidden_dim
        self.layer_dim = layer_dim
        ........

    def forward(self):
        .......
        return ....

rnn = LSTMModel(input_dim, hidden_dim, layer_dim, output_dim, dropout, state_size)
```
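For context, here is a minimal self-contained sketch of what I mean by the RNN part; the `nn.LSTM` wiring, the zero hidden/cell-state initialization, and the final linear layer are illustrative assumptions, since I elided the real body:

```
import torch
import torch.nn as nn

class LSTMSketch(nn.Module):
    # Hypothetical stand-in for the elided LSTMModel body.
    def __init__(self, input_dim, hidden_dim, layer_dim, output_dim, dropout_prob):
        super().__init__()
        self.hidden_dim = hidden_dim
        self.layer_dim = layer_dim
        self.lstm = nn.LSTM(input_dim, hidden_dim, layer_dim,
                            batch_first=True, dropout=dropout_prob)
        self.fc = nn.Linear(hidden_dim, output_dim)

    def forward(self, x):
        # x: (batch, seq_len, input_dim)
        h0 = torch.zeros(self.layer_dim, x.size(0), self.hidden_dim)
        c0 = torch.zeros(self.layer_dim, x.size(0), self.hidden_dim)
        out, _ = self.lstm(x, (h0, c0))
        # Predict from the last time step's hidden state
        return self.fc(out[:, -1, :])
```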

And I wrote the policy like this:

```
class Policy(nn.Module):
    """
    Implements both actor and critic in one model.
    """
    def __init__(self):
        super(Policy, self).__init__()
        ............

    def forward(self):
        ..............

policy = Policy()
```
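Again for context, a toy version of such a two-headed actor-critic module might look like this; the layer sizes and the shared hidden layer are assumptions, not my real code:

```
import torch
import torch.nn as nn
import torch.nn.functional as F

class PolicySketch(nn.Module):
    """Hypothetical actor-critic module; all sizes are placeholders."""
    def __init__(self, state_dim=8, n_actions=4):
        super().__init__()
        self.shared = nn.Linear(state_dim, 64)
        self.actor = nn.Linear(64, n_actions)  # action logits (actor head)
        self.critic = nn.Linear(64, 1)         # state value (critic head)

    def forward(self, x):
        h = F.relu(self.shared(x))
        # Return action probabilities and the state-value estimate
        return F.softmax(self.actor(h), dim=-1), self.critic(h)
```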

Both of these are used in the model like this:

```
class Model(torch.nn.Module):
    def __init__(self, ann_private, rnn, policy):
        super(Model, self).__init__()
        self.rnn = rnn
        self.policy = policy

    def forward(self, private_input, state):
        .....

model = Model(ann_private, rnn, policy)
```
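The point of this composition is that assigning the submodules as attributes registers their weights, so `model.parameters()` covers the RNN and the policy too. A toy stand-in (the `nn.Linear` submodules below are placeholders):

```
import torch
import torch.nn as nn

class ModelSketch(nn.Module):
    # Hypothetical composition of an encoder and a policy head.
    def __init__(self, rnn, policy):
        super().__init__()
        # Assigning nn.Module attributes registers their parameters,
        # so model.parameters() includes both submodules' weights.
        self.rnn = rnn
        self.policy = policy

    def forward(self, state):
        features = self.rnn(state)
        return self.policy(features)
```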

The optimizer is written as `optimizer = optim.Adam(model.parameters(), lr=1e-3)`.

The loss is calculated in the traditional actor-critic way. In each 'subepisode' we compute the states, actions, rewards, and next states and store them in a memory. Later, we randomly draw a sample from the memory and calculate two losses, `Loss_policy` (`Lp`) and `Loss_value` (`Lv`). We wrap the calculated losses with `Variable(…, requires_grad=True)`. Then the code is:

```
loss = Lp + alpha*Lv
loss.mean().backward()
optimizer.step()
```
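For clarity, here is a toy sketch of the shape of this update step; the model, `Lp`, and `Lv` below are stand-ins built from random tensors, not my real losses:

```
import torch
import torch.nn as nn
import torch.optim as optim

# Placeholder model and data; the real Lp/Lv come from a sampled memory batch.
model = nn.Linear(4, 2)
optimizer = optim.Adam(model.parameters(), lr=1e-3)

states = torch.randn(8, 4)
returns = torch.randn(8, 2)
out = model(states)

Lp = -(out * returns).mean()        # stand-in policy loss
Lv = (out - returns).pow(2).mean()  # stand-in value loss
alpha = 0.5

optimizer.zero_grad()               # clear gradients from the previous step
loss = Lp + alpha * Lv
loss.backward()                     # backpropagate through the combined loss
optimizer.step()                    # apply the Adam update
```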

But, after each ‘subepisode’, I run

```
for name, param in model.named_parameters():
    if param.requires_grad:
        print(name, param.data)
```

to print the trainable parameters. All the parameters I expect are there, but none of them ever changes. That means the model, along with the rnn and policy, is not updating. I don't know what's going on.
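For reference, this is the kind of check I mean, here on a toy stand-in model: snapshot the parameters, take one optimizer step, and list which ones moved.

```
import torch
import torch.nn as nn
import torch.optim as optim

# Stand-in model and a single training step, to detect parameter movement.
model = nn.Linear(3, 1)
optimizer = optim.Adam(model.parameters(), lr=1e-3)

# Snapshot every parameter before the step
before = {n: p.detach().clone() for n, p in model.named_parameters()}

loss = model(torch.randn(5, 3)).pow(2).mean()
optimizer.zero_grad()
loss.backward()
optimizer.step()

# Names of parameters whose values changed after the step
changed = [n for n, p in model.named_parameters()
           if not torch.equal(before[n], p.detach())]
print(changed)
```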