Is the .cuda() operation performed in place?

Is model.cuda() or model.to(device) performed in place? Or do I need to reassign the model as such?

if torch.cuda.is_available():
    model = model.cuda()

Or is this okay?

if torch.cuda.is_available():
    model.cuda()

Thank you in advance.

Hi,

No, .cuda() and its counterpart .cpu() are not in-place operations. They create a copy on the target device if the data does not already exist there.
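
For example, a minimal sketch of this copy behavior for a plain tensor (assuming a CUDA device is available):

import torch

t = torch.randn(2, 2)  # allocated on the CPU
t_gpu = t.cuda()       # creates a copy on the GPU
print(t.device)        # cpu -> the original tensor is unchanged
print(t_gpu.device)    # cuda:0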

Best

While @Nikronic’s post is true for tensors (and parameters), the to(), cuda(), and cpu() calls will be executed recursively on all parameters and buffers inside an nn.Module, so you don’t have to re-assign your model:

import torch
import torch.nn as nn

model = nn.Linear(1, 1)
print(model.weight.device)
> cpu
model.cuda()  # moves all parameters and buffers in place, no reassignment needed
print(model.weight.device)
> cuda:0

x = torch.randn(1)
print(x.device)
> cpu
x.cuda() # ERROR!!! returns a new GPU tensor, which is discarded here
print(x.device)
> cpu
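
Since Module.cuda() returns the module itself (self), the reassignment in the original question is harmless; a minimal sketch:

import torch
import torch.nn as nn

model = nn.Linear(1, 1)
if torch.cuda.is_available():
    # Module.cuda() moves the parameters in place and returns self,
    # so `model = model.cuda()` and `model.cuda()` do the same thing here.
    print(model.cuda() is model)  # True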

Hi, thank you very much for answering.

The reason I was asking is that I answered a PyTorch question on Stack Overflow a few months ago and it gained a lot of traction. However, I don't code much in PyTorch anymore, so I wanted to make sure everything I said was right. It turns out a few thousand people saw my post, in which I re-assigned the neural net as such:

if torch.cuda.is_available():
    model = model.cuda()

which is incorrect (or suboptimal) if I understand you correctly. Either way, this is the post I'm referring to.

I wouldn't say the reassignment is wrong or bad in any sense, and it would also make sure you are not forgetting to reassign a tensor using the to() call. Also, while this might add a slight overhead of reassigning an object, the "cost" should be noise compared to the actual data transfer.
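
For reference, a common device-agnostic pattern that reassigns both the model and the data (a minimal sketch; the layer and tensor shapes are made up for illustration):

import torch
import torch.nn as nn

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

model = nn.Linear(10, 1).to(device)  # reassignment is harmless for modules
x = torch.randn(4, 10).to(device)    # required for tensors: to() is not in place

output = model(x)
print(output.device)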

How is memory allocation handled in this example? model.to('cuda') would copy all parameters and buffers to the GPU, but would the same parameters and buffers keep residing on the CPU too?

And regarding the first comment, ".cuda() & .cpu() create a copy on the device side if the model does not exist on that device":
What are the conditions for that? Does the model on the device side have to point to a variable with the same name as the model on the host side?

Yes

Not if you are overwriting the same variable, i.e. model = model.to(device) will free the resources on the previously used device, while model_new = model.to(device) will keep them.
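
A minimal sketch of that difference for a plain tensor (assuming a CUDA device; the sizes are made up for illustration):

import torch

device = torch.device("cuda")

t = torch.randn(1000, 1000)  # allocated on the CPU
t = t.to(device)             # the CPU tensor is no longer referenced -> it can be freed

u = torch.randn(1000, 1000)
u_new = u.to(device)         # `u` still references the CPU copy -> both copies stay alive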

Unsure, but are you asking what the conditions for not existing are?