DataParallel tutorial and cuBLAS errors

Could you post the output of python -m torch.utils.collect_env?
If you are using a Turing GPU and installed the PyTorch 1.8.0 pip wheels with the CUDA 10.2 runtime, please refer to this post and either install a conda binary, the CUDA 11.1 pip wheel, or any nightly release.
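If it helps, here is a rough sketch for checking whether your setup matches that combination. It assumes Turing GPUs report compute capability 7.5; `describe_env` and `matches_reported_combo` are just illustrative helper names, not part of PyTorch. It cannot tell a pip wheel from a conda binary, so the `collect_env` output is still the better thing to post.

```python
import importlib.util


def describe_env():
    """Collect the PyTorch version, its CUDA runtime, and the visible GPUs."""
    info = {"torch": None, "cuda_runtime": None, "gpus": []}
    if importlib.util.find_spec("torch") is None:
        return info  # PyTorch is not installed in this interpreter

    import torch

    info["torch"] = str(torch.__version__)
    info["cuda_runtime"] = torch.version.cuda  # None for CPU-only builds
    if torch.cuda.is_available():
        for i in range(torch.cuda.device_count()):
            major, minor = torch.cuda.get_device_capability(i)
            info["gpus"].append({
                "name": torch.cuda.get_device_name(i),
                "capability": (major, minor),
                # Assumption: Turing cards report compute capability 7.5.
                "is_turing": (major, minor) == (7, 5),
            })
    return info


def matches_reported_combo(info):
    """True for PyTorch 1.8.0 + CUDA 10.2 runtime + at least one Turing GPU."""
    version = info.get("torch") or ""
    return (
        version.startswith("1.8.0")
        and info.get("cuda_runtime") == "10.2"
        and any(gpu.get("is_turing") for gpu in info.get("gpus", []))
    )


if __name__ == "__main__":
    env = describe_env()
    print(env)
    print("matches the reported combination:", matches_reported_combo(env))
```

If the last line prints `True`, switching to one of the builds mentioned above is probably the first thing to try.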