Understanding GPU capacity and model size with large images

Depending on the model architecture, the parameters and buffers may account for only a small fraction of the overall memory usage. This is especially true for conv layers, whose parameters (weight and bias) are often tiny compared to the intermediate forward activations.
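To make this concrete, here is a small stdlib-only back-of-the-envelope sketch (no framework needed; all layer shapes below are illustrative assumptions, not taken from any real model) comparing the memory of a single conv layer's parameters with the memory of its output activations for large input images:

```python
# Rough memory estimate for a single 2D conv layer in float32 (4 bytes/elem).
# All shapes are made-up assumptions for illustration only.

BYTES_PER_ELEM = 4  # float32

def conv_param_bytes(in_ch, out_ch, k):
    # weight: out_ch * in_ch * k * k elements, bias: out_ch elements
    return (out_ch * in_ch * k * k + out_ch) * BYTES_PER_ELEM

def conv_activation_bytes(batch, out_ch, h, w):
    # output feature map, typically kept alive for the backward pass
    return batch * out_ch * h * w * BYTES_PER_ELEM

# Example: 3 -> 64 channels, 3x3 kernel, batch of 8 large 1024x1024 images
params = conv_param_bytes(3, 64, 3)
acts = conv_activation_bytes(8, 64, 1024, 1024)

print(f"parameters : {params / 1024:.1f} KiB")    # ~7 KiB
print(f"activations: {acts / 1024**3:.2f} GiB")   # 2 GiB
```

Even for this single layer the stored activations dwarf the parameters by several orders of magnitude, and the gap grows with the spatial size of the input images.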
This post gives you a better estimate, but note that, in addition to these activations, internal temporary tensors may be created, the CUDA context will use device memory, and libraries such as cuDNN and cuBLAS may allocate workspaces, all of which depend on the actual execution of the model.