Checking the norm of the gradients directly doesn't work in my case, because `p.grad` can be `None` (e.g. for frozen or unused parameters). Instead I use:

    def grad_norm(model):
        # Skip parameters whose .grad is None (frozen or unused parameters)
        parameters = [p for p in model.parameters() if p.grad is not None and p.requires_grad]
        total_norm = 0.0
        for p in parameters:
            # .detach() is enough; .data is redundant here
            param_norm = p.grad.detach().norm(2)
            total_norm += param_norm.item() ** 2
        return total_norm ** 0.5

This works: I printed out the gradient norm and then clipped the gradients using a restrictive clipping threshold.
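For reference, a runnable sketch of that print-then-clip step using `torch.nn.utils.clip_grad_norm_`; the toy `Linear` model and the `0.5` threshold here are just placeholders, not values from my actual setup:

```python
import torch
import torch.nn as nn

def grad_norm(model):
    # Total L2 norm over all parameters that actually have a gradient
    parameters = [p for p in model.parameters() if p.grad is not None and p.requires_grad]
    total_norm = 0.0
    for p in parameters:
        total_norm += p.grad.detach().norm(2).item() ** 2
    return total_norm ** 0.5

# Toy model and loss, just to exercise the function
model = nn.Linear(4, 1)
loss = model(torch.randn(8, 4)).sum()
loss.backward()

print(f"grad norm before clipping: {grad_norm(model):.4f}")

# clip_grad_norm_ rescales gradients in place so their total norm <= max_norm
torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm=0.5)
print(f"grad norm after clipping:  {grad_norm(model):.4f}")
```

After the call, the gradient norm reported by `grad_norm` should be at most the chosen `max_norm` (up to floating-point tolerance).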