About volatile and saving GPU memory

Hi everyone!

I am new to PyTorch and constantly run into memory problems.

  1. Why does my data loader (for training) use so much memory?
  2. How should I manage memory when using data loaders for training/validation?
  3. Is it necessary to set `volatile` on inputs and targets, and will it help?
  4. Should I call `cuda.empty_cache()` between epochs? Or anything else?
  5. I noticed the `volatile` flag has been removed in master. Why?

Thanks in advance!

1 & 2. Usually the data loader doesn’t need to put anything onto the GPU, and even when it does, it shouldn’t use much GPU memory.
3. Yes, when testing.
4. `empty_cache` has no effect on the amount of memory used by PyTorch; it only releases cached blocks back to the driver.
5. Because Tensor and Variable are now the same class, and `torch.no_grad()` is a more reasonable approach.
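A minimal sketch of what point 5 means in practice: wrapping the validation forward pass in `torch.no_grad()` replaces the old `volatile=True` flag, so no autograd graph is built and activation memory is freed immediately. The model and input here are just placeholders.

```python
import torch
import torch.nn as nn

model = nn.Linear(10, 2)
x = torch.randn(4, 10)

# Old (pre-0.4) style was roughly: inputs = Variable(x, volatile=True)
# Current style: wrap the forward pass in no_grad so no graph is recorded.
model.eval()
with torch.no_grad():
    out = model(x)

print(out.requires_grad)  # False: no autograd history is kept
```

Anything computed inside the `with` block has `requires_grad=False` and no `grad_fn`, which is exactly the memory saving `volatile` used to provide.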



Testing and validation too, right?

Yep. Anywhere you don’t need grads.