Hi everyone,
I have a question regarding GPU memory usage when loading identical models in PyTorch.
I am working on a project where I need to load the same model onto the GPU twice. However, when I load the second copy, GPU memory usage only increases slightly, rather than roughly doubling as I expected.
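For context, here is a minimal sketch of how I'm observing this. `nn.Linear` is just a hypothetical stand-in for my actual model, and I'm using `torch.cuda.memory_allocated()` to read how much memory PyTorch has allocated for tensors:

```python
import torch
import torch.nn as nn

# Hypothetical stand-in for my real model; I assume the behavior
# does not depend on the specific architecture.
def make_model():
    return nn.Linear(1024, 1024)

# Two separately constructed copies on the CPU: their parameter
# tensors live at different addresses, so nothing is shared yet.
model_a = make_model()
model_b = make_model()
print(model_a.weight.data_ptr() == model_b.weight.data_ptr())  # False

# How I measure GPU usage (only runs on a CUDA machine):
if torch.cuda.is_available():
    baseline = torch.cuda.memory_allocated()
    model_a.cuda()
    after_first = torch.cuda.memory_allocated()
    model_b.cuda()
    after_second = torch.cuda.memory_allocated()
    print(f"first copy:  {after_first - baseline} bytes")
    print(f"second copy: {after_second - after_first} bytes")
```

(If it matters: I was originally eyeballing total GPU usage rather than using `memory_allocated()`, which may be part of what I'm misreading.)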
Is PyTorch sharing memory between the two identical models? If so, is this an intentional optimization, or a side effect of how PyTorch manages GPU memory?
I would greatly appreciate it if someone could explain this behavior in detail or point me to relevant resources.