I think they might be related.
The slow startup with the binaries points to the CUDA JIT compiler, which is used when the build doesn't ship precompiled kernels for your GPU's compute capability. The error in your source build says the same thing: the expected compute capability is missing for your device.
Are you using any other GPU in this system or only the Turing one?
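To check this on your side, a small sketch like the one below could compare the architectures your build was compiled for against each visible GPU. `classify` and `report` are names I made up for illustration, and the matching rule is a simplification: I'm assuming `sm_XY` entries run natively on devices with the same major version and an equal or higher minor version, and that `compute_XY` (PTX) entries can be JIT-compiled for any newer device. Arch-specific entries such as `sm_90a` are ignored here.

```python
import re


def classify(capability, arch_list):
    """Roughly classify how a build would run on a device.

    capability: (major, minor) tuple, e.g. (7, 5) for Turing.
    arch_list:  strings such as "sm_70" or "compute_75".
    Returns "native", "jit", or "unsupported" (simplified rule, see above).
    """
    major, minor = capability
    device = major * 10 + minor
    # Precompiled kernels (SASS) are listed as sm_XY, PTX as compute_XY.
    sass = [int(m.group(1)) for a in arch_list if (m := re.fullmatch(r"sm_(\d+)", a))]
    ptx = [int(m.group(1)) for a in arch_list if (m := re.fullmatch(r"compute_(\d+)", a))]
    if any(s // 10 == major and s <= device for s in sass):
        return "native"
    if any(p <= device for p in ptx):
        return "jit"
    return "unsupported"


def report():
    """Print the classification for every visible GPU (needs PyTorch)."""
    try:
        import torch
    except ImportError:
        print("PyTorch is not installed in this environment")
        return
    if not torch.cuda.is_available():
        print("No CUDA device is visible")
        return
    arch_list = torch.cuda.get_arch_list()
    print("Architectures in this build:", arch_list)
    for idx in range(torch.cuda.device_count()):
        cap = torch.cuda.get_device_capability(idx)
        name = torch.cuda.get_device_name(idx)
        print(f"cuda:{idx} {name} capability {cap[0]}.{cap[1]} -> {classify(cap, arch_list)}")


if __name__ == "__main__":
    report()
```

If a device shows up as `jit`, that would fit the slow startup, and `unsupported` would fit the error from the source build. The output also lists every GPU in the system, which would answer the question above.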