Calling model.half() manually can easily yield NaN and Inf outputs, because some internal values can overflow the float16 range (the largest finite float16 value is 65504).
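To make the overflow concrete, here is a minimal sketch. The tensor values are made up for illustration and are not taken from any particular model. It only shows how a value outside the float16 range becomes Inf and how NaN follows from that.

```python
import torch

# float16 can only represent finite values up to 65504.
x = torch.tensor([1000.0, 70000.0])

# Casting to half turns the out-of-range value into inf.
x_half = x.half()

# Convert back to float32 for inspection; the inf survives the round trip.
y = x_half.float()

# Once an inf exists, ordinary arithmetic such as inf - inf produces NaN.
z = y - y

print(torch.finfo(torch.float16).max)
print(y)
print(z)
```

Inside a real network the same thing can happen to intermediate activations or reductions, which is why the final outputs end up as NaN or Inf.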
We recommend using automatic mixed precision training as described here, which takes care of these issues for you.
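As a rough sketch of what an automatic mixed precision training loop can look like: the model, data, and hyperparameters below are hypothetical placeholders, and the code assumes a PyTorch release that provides `torch.autocast` and `torch.cuda.amp.GradScaler` (newer releases may prefer `torch.amp.GradScaler`). On a machine without a GPU, both autocast and the scaler are disabled here and the loop runs in plain float32.

```python
import torch
import torch.nn as nn

# Hypothetical toy setup for illustration only.
use_cuda = torch.cuda.is_available()
device = torch.device("cuda" if use_cuda else "cpu")
amp_dtype = torch.float16 if use_cuda else torch.bfloat16

torch.manual_seed(0)
model = nn.Sequential(nn.Linear(16, 32), nn.ReLU(), nn.Linear(32, 1)).to(device)
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
criterion = nn.MSELoss()

# GradScaler scales the loss so that small float16 gradients do not underflow.
# With enabled=False it passes everything through unchanged.
scaler = torch.cuda.amp.GradScaler(enabled=use_cuda)

inputs = torch.randn(8, 16, device=device)
targets = torch.randn(8, 1, device=device)

losses = []
for step in range(3):
    optimizer.zero_grad()
    # The parameters stay in float32; autocast chooses a precision per op.
    with torch.autocast(device_type=device.type, dtype=amp_dtype, enabled=use_cuda):
        outputs = model(inputs)
        loss = criterion(outputs, targets)
    # Backward on the scaled loss, then unscale and step.
    scaler.scale(loss).backward()
    scaler.step(optimizer)  # skips the step if gradients contain inf or NaN
    scaler.update()
    losses.append(loss.item())

print(losses)
```

Note that the model is never converted with model.half(); the weights remain float32 and only selected operations run in lower precision.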
To use AMP, you would have to install the nightly binary or build from master.