I am trying to accelerate CNNs on a Titan V using FP16. I use this piece of code:
```python
# ...
model.half()
# ...
image = image.half()
# ...
outputs = model(image)
```
but I wonder whether the intermediate results (i.e., the accumulators) in convolution operations are also represented in FP16. Besides, several activation functions (e.g., Sigmoid, Tanh) may require higher-precision representations. As the doc says,
> `half()`: Casts all floating point parameters and buffers to `half` datatype.
Does that mean those nonlinear operations are still executed in FP16?
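For reference, here is a minimal sketch of how I would inspect the dtypes of the intermediate outputs (using a hypothetical ResNet-18 as a stand-in for my CNN). Note that this only shows the output dtype of each layer, not the accumulator precision inside the cuDNN kernels, which is the part I am really asking about:

```python
import torch
import torch.nn as nn
import torchvision.models as models

# Hypothetical setup: ResNet-18 stands in for my CNN; any model would do.
model = models.resnet18().cuda().half().eval()
image = torch.randn(1, 3, 224, 224, device="cuda").half()

# Forward hook that prints the dtype of each leaf module's output,
# so I can see whether conv / Sigmoid / Tanh outputs are FP16 or FP32.
def report_dtype(module, inputs, output):
    if torch.is_tensor(output):
        print(f"{module.__class__.__name__}: output dtype = {output.dtype}")

hooks = [
    m.register_forward_hook(report_dtype)
    for m in model.modules()
    if len(list(m.children())) == 0  # leaf modules only
]

with torch.no_grad():
    outputs = model(image)

for h in hooks:
    h.remove()
```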