I am currently working on PEFT memory management, specifically for LoRA fine-tuning. When I wrap a model with the Hugging Face PEFT library, it freezes the backbone model's parameters by setting requires_grad=False.
My first question is: does PyTorch's memory management release activation tensors that are not needed during the backward pass? By "not needed," I mean activations that would only have been used to compute gradients for the backbone model's weights, which are now frozen.
If PyTorch retains these activation tensors in memory, how can I manually prune the unnecessary ones? Is it possible to set them to None directly?
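For reference, here is a minimal sketch of the setup I am describing (the model name and LoRA target modules are just placeholders, substitute your own):

import torch
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

# Placeholder backbone; any causal LM with q_proj / v_proj modules works the same way.
model = AutoModelForCausalLM.from_pretrained("facebook/opt-125m")
lora_config = LoraConfig(r=8, lora_alpha=16, target_modules=["q_proj", "v_proj"])
model = get_peft_model(model, lora_config)

# PEFT freezes the backbone: only the LoRA adapter weights keep requires_grad=True.
trainable = [name for name, p in model.named_parameters() if p.requires_grad]
print(trainable[:4])                    # only lora_A / lora_B parameters show up
model.print_trainable_parameters()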
No additional memory should be used for the backward pass if requires_grad is False. From the Autograd documentation:
During the forward pass, an operation is only recorded in the backward graph if at least one of its input tensors require grad. During the backward pass (.backward()), only leaf tensors with requires_grad=True will have gradients accumulated into their .grad fields.
Also, if you ever want to manually release the graph (and the tensors saved in it) that a tensor is holding on to, you can detach it with some_tensor = some_tensor.detach().
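To make that concrete, here is a small sketch with toy tensors (not a real model) showing that an op whose inputs don't require grad records nothing, and that detach() drops a tensor's reference to the graph:

import torch

x = torch.randn(8, 512)                                 # "activation", requires_grad=False by default
w_frozen = torch.randn(512, 512, requires_grad=False)   # frozen "backbone" weight
w_adapter = torch.randn(512, 512, requires_grad=True)   # trainable "adapter" weight

y = x @ w_frozen
print(y.grad_fn)    # None -> nothing was recorded or saved for backward

z = x @ w_adapter
print(z.grad_fn)    # <MmBackward0 ...> -> the op was recorded and its needed inputs saved

z = z.detach()      # drop the reference to the graph; its memory can now be reclaimed
print(z.grad_fn)    # None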
Hi Brock_Brown!
Thank you for your reply!
Can I understand it this way: when an activation tensor is not needed to compute any gradient, it is not saved in the computation graph, and so PyTorch releases its memory once the forward pass no longer references it. In other words, PyTorch internally prunes the tensors that are not needed for the backward pass?
And another question: Is this achieved through Python’s garbage collection (gc)?
Yup, it will not appear in the computation graph. The memory is released as soon as there are no remaining references to the tensor in Python (reference counting frees it immediately; the garbage collector only steps in for reference cycles).
Forgot a very important part here: the GPU does not actually return memory to the driver unless you have deleted the tensors (or otherwise made sure there are no references to them) and then cleared the caching allocator with torch.cuda.empty_cache().
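A small sketch of what that looks like in practice (the tensor size is arbitrary). memory_allocated tracks live tensors, while memory_reserved tracks what the caching allocator is still holding:

import torch

assert torch.cuda.is_available()

t = torch.randn(4096, 4096, device="cuda")   # roughly 64 MB
print(torch.cuda.memory_allocated())          # includes t
print(torch.cuda.memory_reserved())           # blocks held by the caching allocator

del t                                         # no references left -> the tensor is freed
print(torch.cuda.memory_allocated())          # drops back down
print(torch.cuda.memory_reserved())           # still high: the block stays in the cache

torch.cuda.empty_cache()                      # return unused cached blocks to the driver
print(torch.cuda.memory_reserved())           # now this drops as well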