Cublas runtime error : library not initialized at /data/users/soumith/builder/wheel/pytorch-src/torch/lib/THC/THCGeneral.c:383

I ran into this issue too, even on a single GPU.
I noticed that the cuBLAS samples needed sudo in order to initialize.
To get it working without root, I removed the cache files in the ~/.nv directory.
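
In case it is useful, here is a rough Python sketch of that cleanup. The helper names (clear_nv_cache, cublas_initializes) are my own, not part of PyTorch or CUDA. The check assumes a reasonably recent PyTorch that accepts the device="cuda" argument, and that a small GPU matrix multiply goes through cuBLAS.

```python
import shutil
from pathlib import Path


def clear_nv_cache(cache_dir=None):
    """Delete the per-user NVIDIA cache directory; return True if it existed."""
    # Default to ~/.nv, the directory mentioned above.
    target = Path(cache_dir) if cache_dir is not None else Path.home() / ".nv"
    if not target.is_dir():
        return False
    shutil.rmtree(target)
    return True


def cublas_initializes():
    """Run a tiny matrix multiply on the GPU; return True if it succeeds."""
    import torch  # imported lazily so clear_nv_cache works without PyTorch

    if not torch.cuda.is_available():
        return False
    a = torch.ones(2, 2, device="cuda")
    # Each entry of a @ a is 2, so the four entries sum to 8.
    return bool((a @ a).sum().item() == 8.0)
```

Call clear_nv_cache() as your normal user, then cublas_initializes() to see whether the error is gone. If some of the files in ~/.nv are owned by root, the removal may fail with a PermissionError, and you would need to delete them once with sudo.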
Hope this helps.