I started playing around with the new AMP interface. The thing is: I am training GANs and my models use spectral norm.
I would like to know how things work with mixed precision when using spectral norm. Are my spectrally normalized weights eligible for FP16 precision? Do I need to do something extra to get things working (such as increasing the spectral norm eps)?
AMP will autocast operators as described in this list. Since spectral_norm uses some of them (e.g. matmul), those operations would be performed in FP16. If you want to keep the calculation in FP32, you can disable autocast for this call.
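A minimal sketch of disabling autocast for the spectral-norm layer's forward call (the layer and tensor shapes are illustrative; bfloat16 CPU autocast is used here only so the snippet runs without a GPU, on CUDA you would use `device_type="cuda"` with float16):

```python
import torch
import torch.nn as nn
from torch.nn.utils import spectral_norm

# Illustrative spectrally-normalized layer
layer = spectral_norm(nn.Linear(16, 8))
x = torch.randn(4, 16)

with torch.autocast(device_type="cpu", dtype=torch.bfloat16):
    # Inside autocast, eligible ops (linear/matmul) run in low precision
    y_lowprec = layer(x)
    # Disable autocast for this call to keep it (including the
    # matmuls inside spectral_norm) in FP32
    with torch.autocast(device_type="cpu", enabled=False):
        y_fp32 = layer(x.float())

print(y_lowprec.dtype, y_fp32.dtype)
```

The nested `enabled=False` context restores default FP32 execution only for the wrapped call, so the rest of the forward pass still benefits from autocast.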
@ptrblck, thanks for your answer. I have some questions about AMP, maybe you can clarify them for me.
1 - About weight initialization: let’s say my initializer produces very small weights, will they flush to zero when the model runs inside autocast?
2 - What about model inputs: if I am working with vectors of very small floats as inputs to my model, what happens to them?
Values below the FP16 representable range would flush to zero, but then I would doubt that the small FP32 values would contribute to your model in any way.
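You can check the flush-to-zero behaviour directly by casting small FP32 values to FP16 (the example values are arbitrary; the smallest positive FP16 subnormal is 2^-24 ≈ 5.96e-8):

```python
import torch

# Values around the FP16 subnormal limit (~5.96e-8)
tiny = torch.tensor([1e-8, 6e-8, 1e-4], dtype=torch.float32)
as_half = tiny.half()
# 1e-8 is below the smallest FP16 subnormal and flushes to zero;
# 6e-8 rounds to the nearest subnormal; 1e-4 is representable
print(as_half)
```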
Note that a GradScaler should be used for mixed-precision training together with autocast to avoid gradient underflow.
It would be interesting to know more about your use case, i.e. what kind of model “needs” values close to zero to perform properly.