Model always allocates some memory on gpu:0

I have a workstation with 3 GPUs. Whenever I run a model on a device other than gpu:0 (say gpu:1), the model still allocates some additional memory on gpu:0 (from my observation, it varies between 600M and 900M, which seems to depend on the model I am training). I call it additional memory because when I run the same model on gpu:0, that extra memory isn't allocated.
The model can run, but the behavior is a little annoying. Does anyone know what's going on here?
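Here is a minimal sketch of the setup, in case it helps (assuming nvidia-smi is on the PATH and at least two GPUs are present; the extra usage on gpu:0 only shows up in nvidia-smi, not in PyTorch's own allocator stats):

# Run everything on gpu:1, then check the driver's per-GPU memory view.
import subprocess
import torch

model = torch.nn.Linear(1024, 1024).to("cuda:1")
x = torch.randn(64, 1024, device="cuda:1")
y = model(x)

# torch.cuda.memory_allocated() only tracks PyTorch tensors, so query nvidia-smi instead:
print(subprocess.check_output(
    ["nvidia-smi", "--query-gpu=index,memory.used", "--format=csv"]
).decode())
# On affected setups, gpu:0 reports a few hundred MB used even though
# no tensors were placed on it.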


Initialization depends on the GPU model; some need more memory, some less.

I don’t get it. If the memory is used for initialization, it should be released as soon as the initialization process completes, right? Or should I manually release the memory with some hack?

It’s something that CUDA needs in order to work. It is not a one-off first step that gets released afterwards; it’s more like loading the packages required to make that specific GPU work.
As I mentioned, the exact size depends on the model, but that behavior is okay.

Alright, it still seems like weird behavior to me, because the model doesn’t need that memory at all after initialization. Is there any API I can use to manually release it?
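The closest thing I can find is torch.cuda.empty_cache(), but as far as I understand it only returns PyTorch’s cached blocks to the driver and wouldn’t touch this initialization overhead; a sketch:

import torch

# Frees unused blocks held by PyTorch's caching allocator back to the driver,
# but (as far as I understand) does not release CUDA's own per-device overhead.
torch.cuda.empty_cache()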

It’s not a PyTorch issue but rather NVIDIA’s. I don’t think they would waste memory, since it’s a widely used and mature library, but your question is outside my knowledge. @ptrblck may help you, as I think he is from NVIDIA.

Thanks! I think maybe I shouldn’t worry too much about it.

It’s okay 🙂 Being curious is an excellent way to learn; if you figure it out, write me back!

Based on the size, the CUDA context seems to be initialized on GPU 0 (I thought we got rid of this behavior, but I cannot find the issue on GitHub).

Anyway, if you only want to use specific devices, you can execute your script via

CUDA_VISIBLE_DEVICES=1,2 python script.py

to mask all other devices.

Note that internally in your script, the GPUs will be remapped starting at index 0.
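If you prefer to do it inside the script, you could also set the mask before CUDA is initialized (a sketch assuming a three-GPU machine):

import os
# Must be set before the first CUDA call (safest: before importing torch).
os.environ["CUDA_VISIBLE_DEVICES"] = "1,2"   # hide physical GPU 0

import torch
print(torch.cuda.device_count())   # prints 2
# "cuda:0" now refers to physical GPU 1 and "cuda:1" to physical GPU 2.
model = torch.nn.Linear(10, 10).to("cuda:0")  # lands on physical GPU 1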
