Since I don’t know what you are trying to do here, I can’t really answer that question.
As for this question:
.cpu() is not an in-place operation for a tensor, so assuming loss is a tensor, you need to write it this way:
loss = loss.cpu()
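To see the difference, here is a minimal sketch (it assumes a CUDA device is available; the tensor name is just an example):

import torch

loss = torch.randn(1, device="cuda")   # tensor living on the GPU
loss.cpu()                             # returns a new CPU tensor; does NOT modify loss
print(loss.device)                     # still cuda:0
loss = loss.cpu()                      # reassign to actually keep the CPU copy
print(loss.device)                     # now cpu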
Also! One reason a lot of people run out of VRAM is that they accumulate their total loss without detaching it from the graph or moving the tensor to the CPU, like so:
running_loss += loss  # wrong! this keeps each iteration's computation graph alive and eats VRAM
If keeping a running total of your loss is what you want, but you don’t want it to live on your GPU, you can do this instead:
running_loss += loss.item()
This extracts the loss as a plain Python float, completely detached from the graph and moved off the GPU.
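Putting it together, a minimal training-loop sketch looks like the following. The toy model, data, and optimizer here are only placeholders to make it runnable; swap in your own:

import torch
import torch.nn as nn

device = "cuda" if torch.cuda.is_available() else "cpu"
model = nn.Linear(10, 1).to(device)        # toy model
criterion = nn.MSELoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
data = [(torch.randn(4, 10), torch.randn(4, 1)) for _ in range(5)]  # toy batches

running_loss = 0.0
for inputs, targets in data:
    inputs, targets = inputs.to(device), targets.to(device)
    optimizer.zero_grad()
    loss = criterion(model(inputs), targets)
    loss.backward()
    optimizer.step()
    running_loss += loss.item()            # plain float; the graph is freed each step
print(running_loss / len(data))            # average loss over the epoch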