Yes, I use a combination of gradient clipping and batch normalization, which has pretty much ensured that this never occurs again.
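
For anyone reading along, here is a rough sketch of the two ideas in plain Python. It assumes clipping by global L2 norm (the post does not say which kind of clipping is used), and the function names plus the `max_norm` and `eps` parameters are made up for illustration. In a real training loop you would most likely rely on your framework's built-in versions instead.

```python
import math


def clip_by_global_norm(grads, max_norm):
    """Rescale gradients so their combined L2 norm is at most max_norm.

    Assumption: norm-based clipping; value clipping would instead clamp
    each gradient entry independently.
    """
    total_norm = math.sqrt(sum(g * g for g in grads))
    if total_norm <= max_norm:
        # Already within the limit, so leave the gradients untouched.
        return list(grads)
    scale = max_norm / total_norm
    return [g * scale for g in grads]


def batch_norm(values, eps=1e-5):
    """Normalize one feature across a batch to roughly zero mean, unit variance.

    Simplified: no learned scale/shift and no running statistics.
    """
    mean = sum(values) / len(values)
    var = sum((v - mean) ** 2 for v in values) / len(values)
    return [(v - mean) / math.sqrt(var + eps) for v in values]


# Hypothetical usage: a large gradient gets scaled down, direction preserved.
clipped = clip_by_global_norm([30.0, 40.0], max_norm=5.0)
normalized = batch_norm([1.0, 2.0, 3.0])
```

Clipping bounds the size of any single update, while batch normalization keeps activations in a stable range, so the two tend to complement each other.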