Moving a little data to the GPU takes 5 to 10 minutes

I just do this:

import torch
from torch.autograd import Variable

t = Variable(torch.randn(5))
t = t.cuda()
print(t)

But it takes 435 seconds the first time, and when I re-run the code it again takes 440 seconds. I used pdb to check which part takes the most time, and found this in
/anaconda3/lib/python3.6/site-packages/torch/cuda/__init__.py:

def _lazy_new(cls, *args, **kwargs):
    _lazy_init()
    # We need this method only for lazy init, so we can remove it
    del _CudaBase.__new__
    return super(_CudaBase, cls).__new__(cls, *args, **kwargs)

super(_CudaBase, cls).__new__(cls, *args, **kwargs) takes the most time.

I added torch.cuda.synchronize() in front of the previous code, but the total time is the same. Now torch.cuda.synchronize() takes most of the time (434 seconds), and the previous code takes only one second.
My environment is: Ubuntu 16.04 + CUDA 9.1 + GTX 1060 6GB
I’m a beginner with PyTorch and CUDA, so I don’t know how to solve this problem from this information. Can anyone help me?
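
For anyone trying to reproduce this, here is a minimal sketch that times the first CUDA call separately from a second one. Both measurements show that whichever call touches CUDA first pays the one-time setup cost. The helper name time_first_cuda_call is made up for illustration, and the function returns None when PyTorch or a usable GPU is missing.

```python
import time


def time_first_cuda_call():
    """Time the first and second host-to-GPU copies (hypothetical helper).

    Returns (first_seconds, second_seconds), or None if PyTorch or a
    CUDA device is not available on this machine.
    """
    try:
        import torch
    except ImportError:
        return None
    if not torch.cuda.is_available():
        return None

    # First CUDA call: includes the lazy CUDA initialization.
    start = time.perf_counter()
    torch.randn(5).cuda()
    torch.cuda.synchronize()
    first_seconds = time.perf_counter() - start

    # Second call: CUDA is already initialized, so this is just the copy.
    start = time.perf_counter()
    torch.randn(5).cuda()
    torch.cuda.synchronize()
    second_seconds = time.perf_counter() - start

    return first_seconds, second_seconds


print(time_first_cuda_call())
```

If the first number is minutes and the second is tiny, the time is going into CUDA startup and not into the copy itself.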

How did you install pytorch? Are you sure you installed the cuda 9 version of it? (You can check this with conda list or something like that).

Generally hanging happens if there’s a cuda version mismatch between the cuda your pytorch was compiled with and the cuda you’re running.
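
A quick way to check this from Python, as a rough sketch: print the installed pytorch version and the CUDA version it was built with, then compare against the CUDA on your system. The helper name describe_torch_build is hypothetical. torch.version.cuda is read with getattr because I am assuming very old releases may not expose it.

```python
import importlib


def describe_torch_build():
    """Return version details for the installed PyTorch (hypothetical helper).

    Returns None if PyTorch cannot be imported.
    """
    try:
        torch = importlib.import_module("torch")
    except ImportError:
        return None
    return {
        "torch_version": torch.__version__,
        # CUDA version the binary was compiled with; None for CPU-only
        # builds or if the attribute is missing in this release.
        "compiled_cuda": getattr(torch.version, "cuda", None),
        "cuda_available": torch.cuda.is_available(),
    }


print(describe_torch_build())
```

If compiled_cuda does not match the CUDA version you are running, reinstalling a matching build is the first thing to try.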


I used the following command to install pytorch:

conda install pytorch torchvision -c soumith

I thought this command would install the proper pytorch version for me. I checked, and the installed version is 0.2.0, so that may be the problem. Thanks, you gave me the inspiration to solve this problem.

As you suggested, I updated pytorch, and now moving data to the GPU takes only one second. Thank you for your answer; this problem cost me a lot of time.

To install pytorch 0.3.0 with cuda 9 support you would run: conda install pytorch torchvision cuda90 -c pytorch