PyTorch Models take up more space when sent to device

Zador · December 15, 2021, 12:53pm

Hi, I have a simple question about the memory usage of PyTorch models.
I have the following code:

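import os
import psutil
# Assumed setup (not shown in the original post): a psutil handle to this process.
process = psutil.Process(os.getpid())
print('1', process.memory_info().rss / 1e+9)  # host RSS in GB at start
model = Task()  # Task is my own nn.Module (definition not shown)
print('2', process.memory_info().rss / 1e+9)  # after building the model on CPU
model.to('cuda')
print('3', process.memory_info().rss / 1e+9)  # after moving the model to the GPU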

This code has the following output:

1 0.268644352
2 0.269606912
3 2.501251072

Can somebody explain to me what is going on here? In print('2... am I using the wrong function to check memory info? It seems to me like process.memory_info().rss does not recognize the memory usage until the model weights are sent to my GPU.

ptrblck · December 17, 2021, 5:43am

Could you check how much memory the model is expected to use? I would guess that the last step reports the loaded libs needed to use the GPU, not necessarily the model parameters alone.
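
A rough way to check the expected size is to sum the bytes held by the model's parameters and buffers. This is only a minimal sketch: model_size_bytes and demo are hypothetical helper names, and nn.Linear(1024, 1024) is a stand-in for Task(), whose definition is not shown in this thread. It counts tensor storage only, so it says nothing about the libraries or the CUDA context.

```python
def model_size_bytes(model):
    """Sum the bytes held by a module's parameters and buffers.

    Works on any object exposing parameters() and buffers() whose items
    provide numel() and element_size(), as torch.nn.Module does.
    """
    tensors = list(model.parameters()) + list(model.buffers())
    return sum(t.numel() * t.element_size() for t in tensors)


def demo():
    # Imported here so the helper above has no hard dependency on torch.
    import torch.nn as nn

    model = nn.Linear(1024, 1024)  # stand-in for Task(); an assumption
    print(f"parameters + buffers: {model_size_bytes(model) / 1e9:.6f} GB")
```

Calling demo() prints the size in the same GB units as the prints above. If that number is far below the jump seen at step 3, the difference is coming from something other than the weights.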

Zador · December 31, 2021, 3:48pm

Thanks for the response. I ended up ignoring this problem for the past two weeks and am coming back to it now. This is interesting; I had not realized that extra libs are loaded when you initially send data to a GPU. Do you know where this is discussed?