Is model.cuda() or model.to(device) performed in place? Or do I need to reassign the model, like this:

if torch.cuda.is_available():
    model = model.cuda()

or is this OK?

if torch.cuda.is_available():
    model.cuda()
Thank you in advance.
Hi,
No, .cuda() and its reverse .cpu() are not in-place operations. They create a copy on the target device if one does not already exist there.
Bests
While @Nikronic's post is true for tensors (and parameters), the to(), cuda(), and cpu() calls will be executed recursively on all parameters and buffers inside an nn.Module, so you don't have to re-assign your model:
model = nn.Linear(1, 1)
print(model.weight.device)
> cpu
model.cuda()
print(model.weight.device)
> cuda:0
x = torch.randn(1)
print(x.device)
> cpu
x.cuda() # ERROR!!! not in-place: the returned GPU tensor is discarded
print(x.device)
> cpu
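The same in-place vs. copy distinction can be checked without a GPU by casting dtypes instead of moving devices; nn.Module.to() behaves the same way for device moves (a minimal sketch, not from the thread):

```python
import torch
import torch.nn as nn

model = nn.Linear(1, 1)
ret = model.to(torch.float64)   # casts all parameters in place...
print(ret is model)             # True: the very same module object is returned
print(model.weight.dtype)       # torch.float64

x = torch.randn(1)
y = x.to(torch.float64)         # tensor calls are NOT in place
print(x.dtype)                  # torch.float32 (unchanged)
x = x.to(torch.float64)         # so tensors must be reassigned
```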
Hi, thank you very much for answering.
The reason I was asking is that I answered a PyTorch question on Stack Overflow a few months ago and it gained a lot of traction. However, I don't code much in PyTorch anymore, so I wanted to make sure everything I said was right. It turns out a few thousand people saw my post, where I re-assigned the neural net like this:
if torch.cuda.is_available():
    model = model.cuda()
which is incorrect (or suboptimal) if I understand you correctly. Either way, this is the post I'm referring to.
I wouldn't say the reassignment is wrong or bad in any sense, and I would also make sure you are not forgetting to reassign a tensor via the to() call. Also, while this might add a slight overhead of reassigning an object, the "cost" should be noise compared to the actual data transfer.
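A device-agnostic setup sidesteps the question entirely, since the reassignment idiom is harmless for modules and required for tensors (a sketch of the common pattern; the model and shapes are placeholders):

```python
import torch
import torch.nn as nn

# Pick the device once; the rest of the code works with or without a GPU.
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

model = nn.Linear(1, 1).to(device)    # reassignment-style call is fine for modules
data = torch.randn(8, 1).to(device)   # ...and mandatory for tensors
out = model(data)                      # both now live on the same device
```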
How is memory allocation handled in this example? model.to('cuda') would copy all parameters and buffers to the GPU, but would the same parameters and buffers keep residing on the CPU too?
And regarding the first comment: ".cuda() & .cpu() create a copy on the device side if the model does not exist on that device."
What are the conditions for that? Does the model on the device side have to point to a variable with the same name as the model on the host side?
Yes
Not if you are overriding the same variable. I.e. model = model.to(device) will free the resources on the previously used device, while model_new = model.to(device) will keep them.
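The difference between the two assignments comes down to ordinary Python references, and part of it is easy to verify on the CPU: to() returns the tensor itself when nothing needs to change, and a new tensor otherwise (a sketch; the dtype cast stands in for a cross-device move):

```python
import torch

x = torch.randn(4)
same = x.to('cpu')        # already on the CPU: no copy, the same tensor comes back
print(same is x)          # True

y = x.to(torch.float64)   # an actual change produces a new tensor
print(y is x)             # False

# After `x = x.to(device)`, nothing references the old tensor anymore, so its
# storage can be freed; after `x_new = x.to(device)`, both copies stay alive.
```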
Unsure, but are you asking what the conditions for not existing are?