This is a newbie question (sorry…). I’m trying to speed up my GPU training. The docs recommend pin_memory(), which works with tensors but not with Variables (torch version 0.3.0):
import torch
from torch.autograd import Variable

my_tensor = torch.FloatTensor(8, 8)  # example tensor
my_variable = Variable(my_tensor)

my_tensor.cuda()          # → works
my_tensor.pin_memory()    # → works
my_variable.cuda()        # → works
my_variable.pin_memory()  # → does not work
The last line gives me the error:
Traceback (most recent call last):
File "check.py", line 14, in <module>
my_variable.pin_memory() # -> does not work
File "/usr/local/lib/python2.7/dist-packages/torch/autograd/variable.py", line 67, in __getattr__
return object.__getattribute__(self, name)
AttributeError: 'Variable' object has no attribute 'pin_memory'
I wonder why this is… As my_variable.cuda() allows me to send Variables to the GPU (just like tensors), I would expect pin_memory() to work for Variables too.
Is my understanding fundamentally wrong here?
What’s the correct way of pinning memory (a) for all of a model’s variables/tensors, and (b) when pinning only some of the variables? (A web search only gives me examples in combination with DataLoader(), which does not apply in my case.)
You can do my_variable.data.pin_memory() for now. But do make sure you understand why you are pinning memory; it’s easy to fall into a trap there. Pinning memory is only useful for CPU Tensors that have to be moved to the GPU, so pinning all of a model’s variables/tensors doesn’t make sense at all.
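To illustrate, here is a minimal sketch of the intended use of pinned memory: staging a CPU input batch so the host-to-device copy can run asynchronously. It uses the modern spelling non_blocking=True; in torch 0.3.x the same flag was spelled cuda(async=True). The shapes are made up for the example.

```python
import torch

# A CPU batch that gets shipped to the GPU every iteration (example shape).
batch = torch.randn(64, 3, 32, 32)

if torch.cuda.is_available():
    # Page-locked (pinned) host memory lets the CUDA driver DMA directly,
    # which is what makes the copy below truly asynchronous.
    batch = batch.pin_memory()
    # non_blocking=True lets the copy overlap with GPU compute
    # (torch 0.3.x spelled this batch.cuda(async=True)).
    batch_gpu = batch.cuda(non_blocking=True)
```

Pinning only pays off for tensors that cross the CPU→GPU boundary repeatedly, such as input batches; a tensor that is moved once, or never moved, gains nothing.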
Pinning memory is only useful for CPU Tensors that have to be moved to the GPU.
OK, I understand now that this is about moving input data to the GPU.
But what about the model? Am I correct that the model (its parameters and operations) resides and runs on the GPU anyway?
And if not: how can I tailor which operations are carried out on the GPU? (Sorry if this is a stupid question, but I haven’t been able to dig up proper documentation on that…)
So, pinning all of a model’s variables/tensors doesn’t make sense at all.
Yes, it does. I have a complex application with a number of Tensors, and I have no idea which ones are copied back and forth from the GPU, or how often. Before I go through tens of thousands of lines of code I didn’t write to see if any improvements can be made, I’d like to just test and see what happens. Do I realize the risks of pinning a lot of memory? Yes. I am a perf expert and can judge the footprint of the process relative to the free memory I have, and at worst I risk crashing my system.
If I have to, I’ll binary-patch the assembly code using “gdb -w” on the function at::_ops::is_pinned::call and see what happens. Is it a bad thing if I suppress the pin/extra copy and the memory page gets swapped out? Yes, it is bad and will probably segfault. But there’s a 99% chance it won’t, and I’ll have some idea whether ANY speed-up is possible without spending three days wading through large amounts of app code only to find it doesn’t help.
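Before resorting to patching binaries, a far less invasive experiment is possible from Python. The helper below, pin_everything(), is a hypothetical sketch (not an existing PyTorch API): it re-allocates every CPU parameter of a module in page-locked memory, which is enough to measure whether blanket pinning changes anything.

```python
import torch
import torch.nn as nn

def pin_everything(module):
    """Hypothetical experiment helper: re-allocate every CPU parameter of
    `module` in page-locked (pinned) memory. For quick profiling only;
    pinning large amounts of memory can starve the rest of the system."""
    if not torch.cuda.is_available():
        return  # pin_memory() needs a CUDA-capable build and driver
    for p in module.parameters():
        if not p.is_cuda:
            # Swap the underlying storage for a pinned copy in place.
            p.data = p.data.pin_memory()
```

Usage would be pin_everything(model) right after model construction, followed by timing a few training iterations with and without the call.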