Strategies to save memory

Hi! I’m running memory-intensive code and I want to save as much memory as possible. If I have multiple convolutional layers:

x = self.conv1(x)
x = self.conv2(x)
x = self.conv3(x)

the fact that I’m using the same variable name “x” doesn’t help here, because the intermediate results are all kept in PyTorch’s autograd graph. So the block of code above is no different from this block?:

x1 = self.conv1(x)
x2 = self.conv2(x1)
x3 = self.conv3(x2)

If that’s the case, can I save memory by doing this?:

x = self.conv3(self.conv2(self.conv1(x)))

or does this occupy almost the same amount of memory?

I would appreciate your help!

As far as I understand, all three methods you describe use essentially the same amount of memory: the intermediate activations are kept alive by the autograd graph for the backward pass, regardless of how you name (or don’t name) them. Reusing a variable name only saves the negligible memory needed to store the names themselves, so use as many variable names as make the code easy to read.
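
If you want to check this yourself, one option is to compare the peak CUDA memory of the three variants. A rough sketch, assuming a CUDA device is available (the Conv3d layers, channel counts, and input shape are just placeholders):

import torch
import torch.nn as nn

# Placeholder layers, just for the comparison.
conv1 = nn.Conv3d(1, 8, 3, padding=1).cuda()
conv2 = nn.Conv3d(8, 8, 3, padding=1).cuda()
conv3 = nn.Conv3d(8, 8, 3, padding=1).cuda()

def forward_reuse(x):
    x = conv1(x)
    x = conv2(x)
    x = conv3(x)
    return x

def forward_named(x):
    x1 = conv1(x)
    x2 = conv2(x1)
    x3 = conv3(x2)
    return x3

def forward_nested(x):
    return conv3(conv2(conv1(x)))

for fn in (forward_reuse, forward_named, forward_nested):
    torch.cuda.empty_cache()
    torch.cuda.reset_peak_memory_stats()
    x = torch.randn(1, 1, 64, 64, 64, device="cuda")
    out = fn(x)
    # Peak memory should be roughly identical for all three variants: the
    # intermediate activations are held by the autograd graph, not by the
    # Python names you give them.
    print(fn.__name__, torch.cuda.max_memory_allocated() / 1e6, "MB")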

One way to reduce memory requirements is to process the input in smaller batches, using a PyTorch DataLoader to load the data.
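
For instance, a minimal sketch (the random tensors here are just stand-ins for a real dataset):

import torch
from torch.utils.data import DataLoader, TensorDataset

# Stand-in data: 100 small random volumes with binary labels.
volumes = torch.randn(100, 1, 32, 32, 32)
labels = torch.randint(0, 2, (100,))
dataset = TensorDataset(volumes, labels)

loader = DataLoader(dataset, batch_size=4, shuffle=True)
for batch, target in loader:
    # Only one small batch (4 volumes here) is resident in memory at a time.
    print(batch.shape, target.shape)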

Thank you, Philip! Yes, I’m using a PyTorch DataLoader and my batch size is only 4, but I’m working with huge 3D brain MRI volumes :expressionless:

Thanks again!
Arman

In that case, you could try gradient checkpointing (torch.utils.checkpoint). Here is a simple introduction with (hard-on-the-eyes) examples.
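
As a rough sketch of what that could look like for the three convolutions above (the layer sizes are placeholders, and use_reentrant=False requires a fairly recent PyTorch release):

import torch
import torch.nn as nn
from torch.utils.checkpoint import checkpoint

class Net(nn.Module):
    def __init__(self):
        super().__init__()
        # Placeholder channel counts.
        self.conv1 = nn.Conv3d(1, 8, 3, padding=1)
        self.conv2 = nn.Conv3d(8, 8, 3, padding=1)
        self.conv3 = nn.Conv3d(8, 8, 3, padding=1)

    def _block(self, x):
        # Everything inside this function is recomputed during backward,
        # so the output of conv1 does not need to be stored.
        x = self.conv1(x)
        x = self.conv2(x)
        return x

    def forward(self, x):
        # Only the input and output of the checkpointed block are kept.
        x = checkpoint(self._block, x, use_reentrant=False)
        x = self.conv3(x)
        return x

Checkpointing trades compute for memory: the checkpointed block’s forward pass runs twice, once during the forward pass and once more to rebuild its activations during backward.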
