I am trying to understand a bit better why you run out of memory on the GPU. Specifically I would like to calculate whether something would fit or not before I try and code it up.
This wonderful answer from ptrblck (see the link below) gives me a value that is quite low. I don't question its accuracy, but I am missing the impact of the input. I deal with images that are 2000x2000. How do I combine facts about my input with a model size of, say, 400MB?
Based on the model size alone I should be fine running on a 16GB GPU, but I know that doesn't work.
How can I get some back-of-the-envelope calculations based on the data I have?
The parameters and buffers might account for only a small fraction of the overall memory usage, depending on the model architecture. This is especially true for e.g. conv layers, as their parameters (weight and bias) are often tiny compared to the intermediate forward activations. This post gives you a better estimate. Also note that, in addition to these activations, internal temporary tensors could be created, the CUDA context will use memory on the device, libraries could use workspaces (e.g. cuDNN and cuBLAS), etc., all of which depend on the actual execution of the model.
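To make this concrete: for a 2000x2000 input, a single conv layer producing 64 channels at full resolution already creates 2000 * 2000 * 64 * 4 bytes ≈ 1 GB of float32 activations, so 400MB of parameters is quickly dwarfed. Below is a minimal sketch (my own, not from the linked post) that tallies parameter, buffer, and forward-activation memory with forward hooks. The torchvision resnet18 model and the batch shape are just placeholder assumptions; substitute your own model and input size.

```python
import torch
import torchvision.models as models

def estimate_memory(model, input_shape, dtype=torch.float32):
    bytes_per_elem = torch.tensor([], dtype=dtype).element_size()

    # Parameters and buffers (the "model size")
    param_bytes = sum(p.numel() for p in model.parameters()) * bytes_per_elem
    buffer_bytes = sum(b.numel() for b in model.buffers()) * bytes_per_elem

    # Tally intermediate forward activations via hooks on leaf modules.
    # This only counts tensor outputs of modules, so it is a lower bound.
    activation_bytes = 0
    def hook(module, inp, out):
        nonlocal activation_bytes
        if isinstance(out, torch.Tensor):
            activation_bytes += out.numel() * out.element_size()

    handles = [m.register_forward_hook(hook) for m in model.modules()
               if len(list(m.children())) == 0]  # leaf modules only
    model.eval()  # avoid batchnorm stat updates during the dummy pass
    with torch.no_grad():
        model(torch.zeros(*input_shape, dtype=dtype))
    for h in handles:
        h.remove()

    gib = 1024 ** 3
    print(f"params:      {param_bytes / gib:.3f} GiB")
    print(f"buffers:     {buffer_bytes / gib:.3f} GiB")
    print(f"activations: {activation_bytes / gib:.3f} GiB")

model = models.resnet18()
estimate_memory(model, (1, 3, 2000, 2000))  # placeholder 2000x2000 RGB input
```

Keep in mind this is still only an estimate: training roughly doubles the activation cost because tensors are kept for the backward pass, and optimizer states, temporary tensors, and the library workspaces mentioned above add more on top. The most reliable check is to run a real forward/backward pass on the GPU and read `torch.cuda.max_memory_allocated()`.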