I am dealing with 3D image data, and the bottleneck in network design is memory, both GPU and CPU. I am trying to estimate the GPU memory needed for a given network architecture, but my estimate is always much lower than what the network actually consumes. Consider the following example:
import torch
import torch.nn as nn
from torch.autograd import Variable

net = nn.Sequential(
    nn.Conv3d(1, 16, 5, 1, 2),  # in=1, out=16, kernel=5, stride=1, padding=2
)
net.cuda()

input = torch.FloatTensor(1, 1, 64, 128, 128).cuda()
input = Variable(input)
out = net(input)
The actual GPU memory consumed is 448 MB: I add a breakpoint at the last line and use nvidia-smi to check the GPU memory consumption.
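(As a cross-check, newer PyTorch versions expose torch.cuda.memory_allocated(), which reports only the bytes currently held by tensors and excludes the CUDA context itself; a minimal sketch of that measurement, with illustrative names, would be:)

import torch
import torch.nn as nn

net = nn.Sequential(nn.Conv3d(1, 16, 5, 1, 2)).cuda()
x = torch.zeros(1, 1, 64, 128, 128, device='cuda')

before = torch.cuda.memory_allocated()  # bytes currently held by tensors
out = net(x)
after = torch.cuda.memory_allocated()

# the difference is mostly the 64 MB output, plus whatever autograd saved
print((after - before) / 2**20, 'MB added by the forward pass')
print(after / 2**20, 'MB held by tensors in total')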
However, when I calculate manually, my understanding is that: total GPU memory = memory for parameters × 2 (one copy for values, one for gradients) + memory for storing the forward and backward responses.
So the manual calculation would be 4 MB (for the input) + 64 MB × 2 (for the forward and backward responses) + far less than 1 MB (for the parameters), which is roughly 132 MB. There is still a big gap between 132 MB and 448 MB, and I don't know what I am missing. Any idea how to manually calculate the GPU memory required for a network?
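To make the arithmetic behind my 132 MB estimate explicit, here it is as a small script (assuming float32, i.e. 4 bytes per element, and the layer sizes from the example above):

# Manual estimate for the example above (float32 = 4 bytes/element).
BYTES = 4
MB = 2**20

input_elems = 1 * 1 * 64 * 128 * 128    # N, C, D, H, W
output_elems = 1 * 16 * 64 * 128 * 128  # Conv3d(1, 16, 5, 1, 2) preserves D, H, W
param_elems = 16 * 1 * 5**3 + 16        # weights + biases = 2,016

input_mb = input_elems * BYTES / MB         # 4 MB
fwd_bwd_mb = 2 * output_elems * BYTES / MB  # 64 MB forward + 64 MB backward
param_mb = 2 * param_elems * BYTES / MB     # values + gradients, far below 1 MB

print(input_mb + fwd_bwd_mb + param_mb)     # ~132 MB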