adam: how to output such intermediate values involved as 1st and 2nd moments m and v during training?

You can check `optimizer.param_groups` or `optimizer.state_dict()`.

Thank you, but all I can see are the trained parameters etc., not the moments.

I can see the entire state, including the running averages:

```
import torch
import torch.nn as nn

model = nn.Linear(1, 1)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
out = model(torch.randn(1, 1))
out.backward()  # out has a single element, so backward() needs no gradient argument
optimizer.step()
print(optimizer.state_dict())
# {'state': {0: {'step': tensor(1.), 'exp_avg': tensor([[-0.0219]]), 'exp_avg_sq': tensor([[4.7941e-05]])}, 1: {'step': tensor(1.), 'exp_avg': tensor([0.1000]), 'exp_avg_sq': tensor([0.0010])}}, 'param_groups': [{'lr': 0.001, 'betas': (0.9, 0.999), 'eps': 1e-08, 'weight_decay': 0, 'amsgrad': False, 'maximize': False, 'foreach': None, 'capturable': False, 'params': [0, 1]}]}
```
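If you want the buffers as tensors rather than a printed dict, a minimal sketch is to index `optimizer.state` with the parameter tensors themselves (this assumes at least one `step()` has run, since the state is created lazily):

```python
import torch
import torch.nn as nn

model = nn.Linear(1, 1)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

out = model(torch.randn(1, 1))
out.backward()
optimizer.step()  # state buffers are created on the first step

# optimizer.state maps each parameter tensor to its per-parameter state dict
for p in model.parameters():
    state = optimizer.state[p]
    print(state["exp_avg"])     # running average of the gradient
    print(state["exp_avg_sq"])  # running average of the squared gradient
```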


Thank you, I see the same output, but what is `exp_avg`, and which entries are the moments?

Could anyone help with my last question, please? Thanks.

From reading the paper and the source code:

`exp_avg` is the exponential moving average of the gradient, i.e. the 1st moment vector m.

`exp_avg_sq` is the exponential moving average of the squared gradient, i.e. the 2nd moment vector v.

`betas` are the exponential decay rates for these averages.

Have a nice day.
