Is the .cuda() operation performed in place?

Is model.cuda() or model.to(device) performed in place? Or do I need to reassign the model as such?

if torch.cuda.is_available():
    model = model.cuda()

or is this ok?

if torch.cuda.is_available():
    model.cuda()

Thank you in advance.

Hi,

No, .cuda() and its reverse .cpu() are not in-place operations: they return a copy on the target device if the data does not already exist there.

Bests

While @Nikronic's post is true for tensors (and parameters), the to(), cuda(), and cpu() calls will be executed recursively on all parameters and buffers inside an nn.Module, so you don't have to re-assign your model:

model = nn.Linear(1, 1)
print(model.weight.device)
> cpu
model.cuda()
print(model.weight.device)
> cuda:0

x = torch.randn(1)
print(x.device)
> cpu
x.cuda() # ERROR: the returned GPU copy is discarded, x stays on the CPU
print(x.device)
> cpu
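For tensors the returned copy has to be assigned back explicitly. A minimal sketch of the corrected pattern (the CUDA branch is guarded so it also runs on a CPU-only machine):

```python
import torch

x = torch.randn(1)
if torch.cuda.is_available():
    x = x.cuda()  # reassign: .cuda() returns a copy, it does not modify x

# When the tensor already lives on the target device (with the right dtype),
# .to() is a no-op and simply returns the same tensor object.
y = x.to(x.device)
print(y is x)  # True
```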

hi, thank you very much for answering.

the reason I was asking is that I answered a PyTorch question on Stack Overflow a few months ago and it gained a lot of traction. However, I don't code much in PyTorch anymore, so I wanted to make sure everything I said was right. It turns out a few thousand people saw my post, where I re-assigned the neural net as such:

if torch.cuda.is_available():
    model = model.cuda()

which is incorrect (or suboptimal) if I understand you correctly. Either way, this is the post I'm referring to.

I wouldn't say the reassignment is wrong or bad in any sense, and it also makes sure you don't forget to reassign a tensor when using the to() call. While this might add a slight overhead of reassigning an object, the "cost" should be noise compared to the actual data transfer.
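A common device-agnostic pattern (just a sketch; the uniform reassignment is the point) that treats modules and tensors the same way:

```python
import torch
import torch.nn as nn

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

model = nn.Linear(1, 1).to(device)  # reassignment is harmless for modules
x = torch.randn(4, 1).to(device)    # and required for tensors

out = model(x)
print(out.shape)  # torch.Size([4, 1])
```

Writing both moves identically means the code never silently keeps a tensor on the wrong device.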


How is memory allocation handled in this example? model.to('cuda') would copy all parameters and buffers to the GPU, but would the same parameters and buffers keep residing on the CPU too?

And regarding the first comment: ".cuda() and .cpu() create a copy on the device side if the model does not exist on that device."
What are the conditions for that? Does the model on the device side have to point to a variable with the same name as the model on the host side?

Yes

Not if you are overwriting the same variable: for a tensor, x = x.to(device) lets the storage on the previously used device be freed, while x_new = x.to(device) keeps it alive. (For an nn.Module, to() moves the parameters in place and returns the same object, so the old copies are released either way.)
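A minimal sketch of the difference (the CUDA branch is guarded, since it only runs on a GPU machine):

```python
import torch
import torch.nn as nn

x = torch.randn(3)
if torch.cuda.is_available():
    x_gpu = x.to("cuda")  # x still references the CPU storage: two copies exist
    x = x.to("cuda")      # old CPU storage loses its last reference, can be freed

# Unlike tensors, Module.to() moves parameters in place and returns self:
m = nn.Linear(1, 1)
print(m.to("cpu") is m)  # True
```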

Unsure, but are you asking what the conditions for "not existing" are?