About volatile and saving GPU memory

Hi everyone!

I am new to PyTorch and constantly run into memory problems.

  1. Why does my data loader (for training) use so much memory?
  2. How should I manage memory when using data loaders for training/validation?
  3. Is it necessary to set `volatile` on inputs and targets, and will it help?
  4. Should I call `cuda.empty_cache()` between epochs? Or anything else?
  5. I noticed the `volatile` flag has been removed in master. Why?

Thanks in advance!

1 & 2. Usually the data loader doesn’t need to put anything onto the GPU, and even when it does, it shouldn’t use much GPU memory.
3. Yes, when testing.
4. `empty_cache` has no effect on the amount of memory used by PyTorch; it only releases cached blocks back to the driver.
5. Because Tensor and Variable are now the same class, and `torch.no_grad()` is a more reasonable approach.
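A minimal sketch of what point 5 means in practice: wrapping the validation forward pass in `torch.no_grad()` replaces the old `volatile=True` flag, so no autograd graph is built and activation memory is freed immediately. The model and input here are just placeholders.

```python
import torch
import torch.nn as nn

model = nn.Linear(10, 2)
x = torch.randn(4, 10)

# Old (pre-0.4) style was roughly: inputs = Variable(x, volatile=True)
# Current style: wrap the forward pass in no_grad so no graph is recorded.
model.eval()
with torch.no_grad():
    out = model(x)

print(out.requires_grad)  # False: no autograd history is kept
```

Anything computed inside the `with` block has `requires_grad=False` and no `grad_fn`, which is exactly the memory saving `volatile` used to provide.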



Testing and validation too, right?

Yep. Anywhere you don’t need grads.