[nonissue] Autograd fails when using half-precision - overflow on matrix size

Have a look at this post for some info on FP16 training.
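
For anyone landing here without the background, here is a rough sketch of why the overflow can depend on matrix size. It is not the original reporter's code. It assumes the accumulation itself happens in half precision, which may not match what a given backend actually does. It uses only the standard library's `struct` half-precision format, and the helper names `to_fp16` and `fp16_dot` are made up for illustration.

```python
import math
import struct

FP16_MAX = 65504.0  # largest finite IEEE 754 half-precision value


def to_fp16(x: float) -> float:
    """Round x to the nearest half-precision value; overflow becomes +/-inf."""
    try:
        return struct.unpack("<e", struct.pack("<e", x))[0]
    except OverflowError:
        # struct refuses to pack finite values that are too large for fp16,
        # so mimic the IEEE overflow result here.
        return math.copysign(math.inf, x)


def fp16_dot(a, b):
    """Dot product with every intermediate result rounded to half precision.

    Assumption: products and the running sum are both kept in fp16.
    """
    acc = 0.0
    for x, y in zip(a, b):
        prod = to_fp16(to_fp16(x) * to_fp16(y))
        acc = to_fp16(acc + prod)
    return acc


if __name__ == "__main__":
    # Each product is 16 * 16 = 256, so the running sum passes FP16_MAX
    # once the vectors are long enough, even though every input is small.
    for n in (255, 256):
        print(n, fp16_dot([16.0] * n, [16.0] * n))
```

The inputs never change, only the length of the reduction does, which is why the failure appears to be tied to matrix size. Keeping the accumulator in a wider type or scaling the values down are the usual ways around this.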