Is there a way to assess an appropriate value to clip the gradients to? For example, can I collect the gradient norms from my models, compute the nth percentile, and use that as the clip value?
for epoch in range(n_epochs):
    encoder = …
    decoder = …
    <<< SOME TRAINING CODE … >>>
    # Accumulate the new loss
    loss += new_loss
    # Backprop
    loss.backward()
    # Clip gradients: gradients are modified in place
    clip = some_value  # ideally based on the nth percentile of all gradient norms
    _ = nn.utils.clip_grad_norm_(encoder.parameters(), clip)
    _ = nn.utils.clip_grad_norm_(decoder.parameters(), clip)
    # Adjust model weights
    encoder_optimizer.step()
    decoder_optimizer.step()
    # Increment the schedulers
    scheduler_encoder.step()
    scheduler_decoder.step()
    current_learning_rate = []
    for param_group in encoder_optimizer.param_groups:
        current_learning_rate.append(param_group['lr'])
return loss, current_learning_rate
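Here is a minimal sketch of one way I imagine this could work: after loss.backward(), gather the per-parameter gradient norms from both models and take the nth percentile with torch.quantile. The helper name percentile_clip_value and the choice of q=0.9 are just illustrative, not an established API.

import itertools
import torch
import torch.nn as nn

def percentile_clip_value(parameters, q=0.9):
    """Return the q-th quantile of the per-parameter gradient L2 norms.

    Hypothetical helper; call it after loss.backward() and before
    clip_grad_norm_, so that .grad is populated.
    """
    grad_norms = [
        p.grad.detach().norm()   # L2 norm of this parameter's gradient
        for p in parameters
        if p.grad is not None    # skip params that received no gradient
    ]
    return torch.quantile(torch.stack(grad_norms), q).item()

# Inside the training loop, after loss.backward():
# clip = percentile_clip_value(
#     itertools.chain(encoder.parameters(), decoder.parameters()), q=0.9)
# _ = nn.utils.clip_grad_norm_(encoder.parameters(), clip)
# _ = nn.utils.clip_grad_norm_(decoder.parameters(), clip)

One caveat: clip_grad_norm_ clips the total norm over all the parameters it is given, so taking a percentile of per-parameter norms within a single step is only a heuristic. A variant I have seen (roughly the AutoClip idea) keeps a running history of the total gradient norm across steps and clips to a percentile of that history instead.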