Half-precision convolution causes NaN in forward pass

Hello,
I’m using CUDA v11.3 on a GTX 1660 Ti and I’m seeing the same issue you described, whereas the same code runs without any problems on my old GTX 1050.
I will try downgrading to CUDA v10.2, as you did, to see if that fixes it. In the meantime, have you managed to upgrade to a more recent CUDA version without running into the same problem again?
Thank you.