Intermediate data type using model.half()

Hello!
I am trying to accelerate CNNs on a Titan V using FP16, with this piece of code:

# ...
model.half()          # casts all floating-point parameters and buffers to FP16
# ...
image = image.half()  # the input has to be cast to FP16 as well
# ...
outputs = model(image)

but I wonder whether the intermediate results (i.e., the accumulators) in the convolution operations are also represented in FP16. Besides, several activation functions (e.g., Sigmoid, Tanh) may require higher-precision representations. The docs say:

half() Casts all floating-point parameters and buffers to half datatype.

Does this mean those nonlinear operations also run in FP16?
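For what it's worth, one way I could check this seems to be registering forward hooks that print the dtype of every layer's output. A minimal sketch (the tiny Sequential model and input shape here are just stand-ins for my actual CNN):

import torch
import torch.nn as nn

# toy stand-in for the real network; layer sizes are arbitrary
model = nn.Sequential(nn.Conv2d(3, 8, 3), nn.Sigmoid()).half().cuda()

def report_dtype(module, inputs, output):
    # print what dtype each layer actually produces
    print(type(module).__name__, output.dtype)

for layer in model:
    layer.register_forward_hook(report_dtype)

image = torch.randn(1, 3, 32, 32).half().cuda()
outputs = model(image)  # expect torch.float16 for every layer, Sigmoid included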

Thank you!

Hi,

It does convert everything to float16, which is indeed not optimal in most cases.
The NVIDIA folks have done a lot of testing around this and released the AMP library to automatically convert only the right parts of the model to FP16!
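For reference, NVIDIA's AMP (originally part of the apex library) was later upstreamed into PyTorch as torch.cuda.amp, so a rough sketch of a mixed-precision training step looks like this (the model, data, and loss below are just placeholders):

import torch
import torch.nn as nn

# placeholder model, data, and loss; swap in your own CNN and training loop
model = nn.Sequential(nn.Conv2d(3, 8, 3), nn.Sigmoid()).cuda()  # stays FP32
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
scaler = torch.cuda.amp.GradScaler()
image = torch.randn(8, 3, 32, 32, device="cuda")

optimizer.zero_grad()
with torch.cuda.amp.autocast():
    # autocast runs FP16-safe ops (e.g. convolutions) in half precision
    # and keeps precision-sensitive ops in float32 automatically
    outputs = model(image)
    loss = outputs.mean()  # placeholder loss
scaler.scale(loss).backward()  # loss scaling avoids FP16 gradient underflow
scaler.step(optimizer)
scaler.update()

Note that with autocast the parameters stay in float32 and the casting happens per op inside the context, which is exactly the "only convert the right parts" behaviour.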
