Model.cuda() takes a long time, is that normal?

I use resnet50 + upsampling as my network.
model = resnet50_up()
And I want to use GPU, so I add the code:
model = model.cuda()

But it takes a really, really long time. I just want to know: is this normal because I use resnet50?
Besides, the code below (I reconstructed the missing timing lines; the idea is to time the .cuda() transfer on each iteration):

import torch
from datetime import datetime

for i in range(10):
    x = torch.randn(10, 10, 10, 10)  # similar timings regardless of the tensor size
    t1 = datetime.now()
    x = x.cuda()
    print(i, datetime.now() - t1)

It also takes quite some time, especially on the first iteration.
I am so confused…

I installed PyTorch with Anaconda; the version is 0.3.0. The GPU is a Titan Xp, with CUDA 8.0 and cuDNN 7.0.5.


The first time you make any CUDA call, PyTorch needs to initialize all the different CUDA states (driver context, device handles, etc.). That takes some time, and it happens only once per process.
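To see this one-time cost, you can time the same GPU transfer twice in a fresh process: the first call absorbs the initialization, the second reflects the steady-state cost. The helper name `timed_cuda_copy` below is my own; this is a minimal sketch that assumes a CUDA-capable PyTorch build and falls back gracefully without one.

```python
import time
import torch

def timed_cuda_copy(shape=(10, 10, 10, 10)):
    """Return the seconds taken to move one random tensor to the GPU."""
    x = torch.randn(*shape)
    t0 = time.perf_counter()
    x = x.cuda()
    torch.cuda.synchronize()  # wait for the asynchronous copy to finish
    return time.perf_counter() - t0

if torch.cuda.is_available():
    first = timed_cuda_copy()   # includes the one-time CUDA init
    later = timed_cuda_copy()   # steady-state transfer, much faster
    print(f"first call: {first:.3f}s, second call: {later:.6f}s")
else:
    print("no CUDA device available; nothing to time")
```

The `torch.cuda.synchronize()` call matters: GPU copies are asynchronous, so without it the timer can stop before the work is actually done.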


@albanD What has changed? The first cuda call now takes 1.8 GB of RAM and 500 MB of VRAM. Is it normal?

Someone told me to upgrade PyTorch to 0.3.1, saying it would be faster. I upgraded, and it really is faster than before!

What changed, and when? What is this CUDA call doing? Nothing special has changed in CUDA initialization in a while, except in 0.3.1 iirc.

@albanD even a simple torch.tensor(0).to('cuda') allocates that space if it is the first cuda call.

On my machine with a single GPU, CUDA initialization takes ~300 MB on the GPU. If you have more GPUs, I think it takes a bit more because of the possible peer-to-peer (p2p) setup between devices. That is expected.
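One practical consequence: on a multi-GPU machine you can hide the extra devices before PyTorch touches CUDA, which skips the per-pair p2p setup for them. A minimal sketch, assuming the environment variable is set before any CUDA call in the process (setting it afterwards has no effect):

```python
# Expose only the first GPU to this process; must run before
# torch performs any CUDA operation.
import os
os.environ["CUDA_VISIBLE_DEVICES"] = "0"

import torch

if torch.cuda.is_available():
    # Only the exposed device is visible to PyTorch now.
    print("visible GPUs:", torch.cuda.device_count())
else:
    print("no CUDA device available")
```

In practice this is usually set on the shell command line instead, e.g. `CUDA_VISIBLE_DEVICES=0 python train.py`.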