Proper way to do gradient clipping?

My bad, I thought you were suggesting that if you do gradient clipping, you should (for some reason) use custom updates instead of optimizer.step(). Now I get it: you meant that if you use custom updates, you should not also call optimizer.step(), to avoid mixing custom and automatic updates. Makes sense!
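For anyone landing here later, a minimal sketch of how I understand it. The toy model, the data, and the `MAX_NORM` and `LR` values are my own assumptions for illustration, not something from this thread. Clipping goes between `backward()` and the update, and each model is updated by exactly one mechanism: `optimizer.step()` in option A, a hand-written update in option B.

```python
import copy

import torch
from torch import nn

torch.manual_seed(0)
x, y = torch.randn(8, 4), torch.randn(8, 1)  # toy data (assumption)
MAX_NORM, LR = 1.0, 0.1                      # illustrative values (assumption)

model_a = nn.Linear(4, 1)
model_b = copy.deepcopy(model_a)  # same initial weights, for comparison

# Option A: clip the gradients, then let the optimizer apply the update.
optimizer = torch.optim.SGD(model_a.parameters(), lr=LR)
optimizer.zero_grad()
nn.functional.mse_loss(model_a(x), y).backward()
torch.nn.utils.clip_grad_norm_(model_a.parameters(), MAX_NORM)  # clips in place
optimizer.step()

# Option B: clip the gradients, then apply a custom update by hand.
# No optimizer.step() here, so the two mechanisms are never mixed.
model_b.zero_grad()
nn.functional.mse_loss(model_b(x), y).backward()
torch.nn.utils.clip_grad_norm_(model_b.parameters(), MAX_NORM)
with torch.no_grad():
    for p in model_b.parameters():
        p -= LR * p.grad  # plain SGD rule, written out manually
```

With plain SGD (no momentum or weight decay) both options should end up with the same weights, so the choice is only about who performs the update, not about whether you clip.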
