Plot gradient magnitude vs. training step in SGD

I’m using CNN2D for text sentiment analysis with the following settings:

optimizer = optim.Adam(model_pyt.parameters())
criterion = nn.BCEWithLogitsLoss()

I’m currently training various neural networks, and I’d like to understand how the gradient behaves at each training step, so I want to plot the gradient magnitude against the training step to see whether the gradient is vanishing or exploding. Is there a way to do this? Is there also a way to plot the learning rate per training step, so we can observe how it changes? It should shrink as we approach a stationary point of the loss function.


You can get all gradients by iterating over the parameters after the backward pass:

for name, param in model.named_parameters():
    # param.grad is populated once loss.backward() has run
    print(name, param.grad)

Using this approach you can store the gradient magnitude at each iteration and plot it with e.g. matplotlib.
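For example, here is a minimal sketch of tracking the total gradient norm per step, assuming the model, criterion, and optimizer from above and a hypothetical train_loader:

import matplotlib.pyplot as plt
import torch

grad_norms = []  # one value per training step

for inputs, targets in train_loader:
    optimizer.zero_grad()
    loss = criterion(model(inputs), targets)
    loss.backward()

    # L2 norm over all parameter gradients at this step
    total_norm = torch.norm(torch.stack(
        [p.grad.detach().norm(2) for p in model.parameters() if p.grad is not None]
    ))
    grad_norms.append(total_norm.item())

    optimizer.step()

plt.plot(grad_norms)
plt.xlabel("training step")
plt.ylabel("gradient L2 norm")
plt.yscale("log")  # vanishing/exploding trends show up clearly on a log scale
plt.show()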

To get the learning rate and other internal states, you can use optimizer.state_dict() or index into optimizer.param_groups (a list with one dict per parameter group) and read the values you would like to track.
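For instance, a small sketch of recording the learning rate at each step. Note that with plain Adam the "lr" in param_groups stays constant unless a scheduler changes it, so the StepLR below is just a hypothetical example to make the curve non-trivial:

import matplotlib.pyplot as plt
from torch.optim.lr_scheduler import StepLR

# Hypothetical schedule: shrink the lr by 10% every 100 steps
scheduler = StepLR(optimizer, step_size=100, gamma=0.9)

lrs = []
for step in range(1000):
    # ... forward pass, loss.backward(), optimizer.step() as above ...
    lrs.append(optimizer.param_groups[0]["lr"])  # lr of the first (only) param group
    scheduler.step()

plt.plot(lrs)
plt.xlabel("training step")
plt.ylabel("learning rate")
plt.show()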


Yes, this worked, thank you. One comment though: instead of storing param.grad directly, we need to store a clone, param.grad.clone(), in order to capture the gradient at each time step t for plotting.
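In other words, inside the loop over named_parameters (grad_history here is a hypothetical list collecting per-step gradients):

# Stores a live reference: a later backward pass (or zero_grad) can
# modify the same tensor in place, so every entry may end up identical
grad_history.append(param.grad)

# Stores a snapshot of the values at this step
grad_history.append(param.grad.clone())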

Thanks!