Slicing tensor with pin_memory=True

Usually pin_memory is used when creating dataloaders, to allow CUDA to use DMA (direct memory access), but I noticed that it’s also possible to create tensors with pin_memory=True.

I’m curious what happens if I slice a tensor in pinned memory. My guess is that the CPU would first have to copy the sliced tensor into pinned memory and then let CUDA copy from it. Am I right?

I assume that by slice you mean subset of a tensor right? like, 1)?
If so, the new tensor is a view that looks at the same memory as the original one. So it is still in pinned memory and there is no need to copy it.
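A minimal sketch of how one might check this. It assumes a CUDA-enabled build for the pinned part and falls back to pageable memory otherwise; the names `can_pin`, `row` and `block` are only illustrative.

```python
import torch

# Pinning host memory needs a CUDA-enabled build; otherwise use pageable memory.
can_pin = torch.cuda.is_available()
x = torch.randn(3, 4, pin_memory=can_pin)

# Basic slicing returns views: no data is copied.
row = x[0]
block = x[1:, :2]

# The first-row view starts at the same address as the original tensor.
same_start = row.data_ptr() == x.data_ptr()

# Writing through the view is visible in the original tensor.
row.fill_(1.0)
shares_memory = bool((x[0] == 1.0).all())

# Views of a pinned tensor should report as pinned as well
# (only checked when pinning is actually possible).
if can_pin:
    print(row.is_pinned(), block.is_pinned())
```

Since the views alias the pinned buffer, no extra staging copy on the host should be needed before handing them to CUDA.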

Is that still the case for advanced indexing? For example, is the output tensor below also in pinned memory? I assumed that since the subset of the tensor is now non-contiguous in memory, CUDA would have difficulty copying it through DMA.

>>> import torch
>>> x = torch.randn(3, 4)
>>> x[range(3), [0, 1, 2]]
tensor([ 0.0127, -0.2941, -2.1625])


I just found out that there’s an .is_pinned() method I can use to check whether a tensor is in pinned memory, and it seems that the advanced-indexing result is not pinned.

>>> x = torch.randn(3, 4, pin_memory=True)
>>> x[range(3), range(3)].is_pinned()
False
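A hedged sketch of the same observation in script form, plus one possible way to stage the result for an asynchronous transfer. It assumes the behaviour of current PyTorch releases, where integer-array indexing allocates a new tensor; the names `idx`, `diag`, `staged` and `on_gpu` are illustrative, and the GPU part only runs when CUDA is available.

```python
import torch

can_pin = torch.cuda.is_available()
x = torch.randn(3, 4, pin_memory=can_pin)

# Advanced (integer-array) indexing gathers the elements into a new tensor.
idx = torch.arange(3)
diag = x[idx, idx]

# The result does not alias x here: modifying it leaves x untouched.
before = x.clone()
diag.zero_()
x_unchanged = torch.equal(x, before)

if can_pin:
    # The new allocation is ordinary host memory, not the pinned buffer of x.
    print(diag.is_pinned())
    # Tensor.pin_memory() copies into pinned memory, which allows
    # a non-blocking host-to-device transfer.
    staged = diag.pin_memory()
    on_gpu = staged.to("cuda", non_blocking=True)
    torch.cuda.synchronize()
```

So the extra host-side copy guessed at in the original question does happen in this case, but it is the indexing itself that makes it, and the copy lands in pageable memory unless it is pinned explicitly.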

You can indeed check with .is_pinned().
Unfortunately, advanced indexing makes no guarantee about whether the new Tensor you get back is a copy or shares memory with the original. By contrast, y = x[0] will most likely share memory.
Note, though, that x[bar] = foo will always write into x, whereas after y = x[bar], writing foo into y in place might or might not modify x.
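To make the difference concrete, here is a small sketch. It assumes current PyTorch behaviour, where indexing with an index tensor returns a copy; as noted above, that is not something to rely on in general. The variable names are illustrative.

```python
import torch

x = torch.zeros(3, 4)
idx = torch.tensor([0, 2])

# Indexed assignment always writes into x itself.
x[idx] = 1.0
rows_written = x.sum().item()  # rows 0 and 2 now hold ones

# Reading with the same index gives a separate tensor here,
# so an in-place write on it does not reach x.
y = x[idx]
y.fill_(5.0)
x_after_copy_write = x.clone()

# With basic indexing the result is a view,
# so the same kind of in-place write does reach x.
z = x[1]
z.fill_(7.0)
```

If the write has to land in x, assigning through x[bar] = foo directly is the safe choice.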
