Are there any plans for a fix for CUDA 11?

This has been a problem for a very long time: CUDA 11 is slower than CUDA 10.

GitHub - issue

My 5-year-old laptop with an Nvidia 940M (CUDA 10) performs better than the 3070 Ti (CUDA 11). This is nonsense!

This has gone unfixed for so long that I already regret buying the 3070 Ti.

The issue is most likely caused not by CUDA but by cuDNN, as described in this and other issues.
You have three options: check the 1.9.0 binaries, which fixed some missing cutlass kernels while statically linking cuDNN; build from source with the latest release; or try the nightly binaries with cuDNN 8.2.2.
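
Before switching binaries, it can help to confirm which CUDA and cuDNN versions the installed PyTorch build actually ships with. The following is a minimal sketch, not an official diagnostic: `decode_cudnn_version` and `report_versions` are hypothetical helper names, and the decoding assumes the integer encoding used by cuDNN releases (four digits such as 8202 for 8.2.2, five digits for 9.x).

```python
def decode_cudnn_version(v: int) -> str:
    """Turn cuDNN's integer version into 'major.minor.patch'.

    Assumption: versions below 10000 use major*1000 + minor*100 + patch
    (e.g. 8202), larger values use major*10000 + minor*100 + patch.
    """
    if v >= 10000:
        major, rest = divmod(v, 10000)
    else:
        major, rest = divmod(v, 1000)
    minor, patch = divmod(rest, 100)
    return f"{major}.{minor}.{patch}"


def report_versions() -> None:
    """Print the versions bundled with the installed PyTorch build."""
    try:
        import torch
    except ImportError:
        print("PyTorch is not installed in this environment")
        return

    print("PyTorch:", torch.__version__)
    # torch.version.cuda is None for CPU-only builds.
    print("CUDA (build):", torch.version.cuda)
    # Returns None when cuDNN is not available.
    cudnn = torch.backends.cudnn.version()
    print("cuDNN:", decode_cudnn_version(cudnn) if cudnn is not None else "not available")
    if torch.cuda.is_available():
        print("GPU:", torch.cuda.get_device_name(0))


if __name__ == "__main__":
    report_versions()
```

Running this in each environment (1.9.0, a source build, a nightly) shows whether the cuDNN version really differs between them.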

I’d rather not do any of that. I’m just wondering when a ready-made package (installable with pip) that fixes this problem will be available.

Thanks for the answer!

I’m seeing the same slowdown in our own CUDA code, which is unrelated to cuDNN. We also use cuDNN, but the slowdown mostly affects our own kernels. Unfortunately, we have to use CUDA 11 to support Ampere in cuDNN, and as a result we took a 30-40% performance hit on our own code.

If updating the CUDA toolkit caused the slowdown in your custom CUDA code, I would recommend posting this issue on the NVIDIA discussion board so that the compiler devs are aware of it.

EDIT: In case you are seeing the slowdown in 11.2, could you update to >=11.2U2 and check the performance again? We were aware of an increase in local memory usage for 11.2 and 11.2U1, which was fixed in newer versions.
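
When comparing toolkit versions, timing the same workload the same way in each environment makes the numbers easier to trust. Below is a rough sketch of such a timing helper, assuming the workload can be called from Python; `median_runtime_ms` is a hypothetical name, not part of any library. CUDA kernel launches are asynchronous, so the helper accepts an optional `sync` callback that should block until the GPU has finished.

```python
import statistics
import time
from typing import Callable, Optional


def median_runtime_ms(
    fn: Callable[[], object],
    warmup: int = 3,
    iters: int = 20,
    sync: Optional[Callable[[], object]] = None,
) -> float:
    """Return the median wall-clock time of fn() in milliseconds."""

    def run_once() -> None:
        fn()
        if sync is not None:
            # Wait for queued GPU work so the timing covers execution,
            # not just the kernel launch.
            sync()

    # Warm-up runs absorb one-time costs (lazy init, autotuning, caches).
    for _ in range(warmup):
        run_once()

    samples = []
    for _ in range(iters):
        start = time.perf_counter()
        run_once()
        samples.append((time.perf_counter() - start) * 1e3)
    return statistics.median(samples)
```

With PyTorch, for example, one could call `median_runtime_ms(lambda: model(x), sync=torch.cuda.synchronize)`, where `model` and `x` stand in for your own workload, and compare the result before and after updating the toolkit.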