I want to do mixed-precision training using 8 bits. Is there an example? What is the best way to fake-quantize the gradient in the backward pass? Many thanks!