Lifetime of CPU tensors which use pinned memory

Does this code have a use-after-free bug?

import torch

def foo():
    A = torch.rand(128, device="cpu", pin_memory=True)  # pinned (page-locked) host tensor
    B = A.cuda(non_blocking=True)  # async host-to-device copy on the current stream
    return B

C = foo()
# do stuff with C

My concern is that when foo() returns, the refcount of the CPU tensor A hits zero and its pinned memory is returned to PyTorch's host allocator, potentially while the asynchronous host-to-device copy is still in flight.

Does PyTorch make sure that B keeps a reference to A (until the next stream sync) to prevent this issue? Or as the user, do I need to manually hold a reference to A until I know it’s safe to free the pinned memory?

Turns out this is not a use-after-free. PyTorch's caching host allocator tracks the stream on which the copy was enqueued and does not hand the pinned block out for reuse until the memcpy has completed, so B is safe to use even though A has been freed. Reference: [CUDA] Remove footgun related to non-blocking copies · Issue #130785 · pytorch/pytorch · GitHub
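If you would rather not rely on the allocator's internal bookkeeping, a defensive variant holds the pinned source tensor yourself and synchronizes the copy stream before letting it go. This is a sketch (the function name foo_defensive is mine, and it assumes a CUDA device is available); torch.cuda.current_stream().synchronize() is a heavier hammer than strictly needed, but it makes the lifetime reasoning explicit:

```python
import torch

def foo_defensive():
    # Keep the pinned source tensor A alive in this scope, and block until
    # the current stream has drained before returning. After the sync, the
    # H2D copy is complete, so it is provably safe for A to be freed.
    A = torch.rand(128, device="cpu", pin_memory=True)
    B = A.cuda(non_blocking=True)
    torch.cuda.current_stream().synchronize()  # copy finished; A may now die
    return B

if torch.cuda.is_available():
    C = foo_defensive()
    print(C.shape, C.is_cuda)
```

A lighter-weight alternative along the same lines would be to record a torch.cuda.Event after the copy and hold A until event.query() returns True.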