Is there a way to assess an appropriate value to clip the gradients to? For example, can I collect the gradient norms from my models, compute the nth percentile, and use that as the clip value?
for epoch in range(n_epochs):
    encoder = …
    decoder = …
    <<< SOME TRAINING CODE … >>>
    # Accumulate the new loss
    loss += new_loss
    # Backprop
    loss.backward()
    # Clip gradients: gradients are modified in place
    clip = some_value  # ideally based on the nth percentile of all gradient norms
    _ = nn.utils.clip_grad_norm_(encoder.parameters(), clip)
    _ = nn.utils.clip_grad_norm_(decoder.parameters(), clip)
    # Adjust model weights
    encoder_optimizer.step()
    decoder_optimizer.step()
    # Increment the schedulers
    scheduler_encoder.step()
    scheduler_decoder.step()
    current_learning_rate = []
    for param_group in encoder_optimizer.param_groups:
        current_learning_rate.append(param_group['lr'])
return loss, current_learning_rate
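Here is a minimal sketch of one way I imagine this could work: after loss.backward(), gather the per-parameter gradient norms from both models and take the nth percentile with torch.quantile. The helper name percentile_clip_value and the choice of q=0.9 are just illustrative, not an established API.

import itertools
import torch
import torch.nn as nn

def percentile_clip_value(parameters, q=0.9):
    """Return the q-th quantile of the per-parameter gradient L2 norms.

    Hypothetical helper; call it after loss.backward() and before
    clip_grad_norm_, so that .grad is populated.
    """
    grad_norms = [
        p.grad.detach().norm()   # L2 norm of this parameter's gradient
        for p in parameters
        if p.grad is not None    # skip params that received no gradient
    ]
    return torch.quantile(torch.stack(grad_norms), q).item()

# Inside the training loop, after loss.backward():
# clip = percentile_clip_value(
#     itertools.chain(encoder.parameters(), decoder.parameters()), q=0.9)
# _ = nn.utils.clip_grad_norm_(encoder.parameters(), clip)
# _ = nn.utils.clip_grad_norm_(decoder.parameters(), clip)

One caveat: clip_grad_norm_ clips the total norm over all the parameters it is given, so taking a percentile of per-parameter norms within a single step is only a heuristic. A variant I have seen (roughly the AutoClip idea) keeps a running history of the total gradient norm across steps and clips to a percentile of that history instead.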