Calculating the loss consumes a lot of CPU RAM

I wrote a custom loss function and trained my model on the GPU, but while the loss is being calculated, CPU RAM usage keeps rising. I would like to know what causes this. (Is it related to the computation graph?)

The following code reproduces the problem; my loss function performs similar operations.

import torch
from tqdm import tqdm

# Accumulator that does not require gradients itself.
b = torch.zeros(5).cuda()

# Eight small tensors whose products require gradients.
d1 = 3 * torch.zeros(1, requires_grad=True).cuda()
d2 = 3 * torch.zeros(1, requires_grad=True).cuda()
d3 = 3 * torch.zeros(1, requires_grad=True).cuda()
d4 = 3 * torch.zeros(1, requires_grad=True).cuda()
d5 = 3 * torch.zeros(1, requires_grad=True).cuda()
d6 = 3 * torch.zeros(1, requires_grad=True).cuda()
d7 = 3 * torch.zeros(1, requires_grad=True).cuda()
d8 = 3 * torch.zeros(1, requires_grad=True).cuda()

for i in tqdm(range(1000000)):
    b += (d1 * d2 * d3 * d4 * d5 * d6 * d7 * d8)  # CPU RAM grows with each iteration

Yes, a new computation graph is created and attached to b in each iteration, so all of the metadata needed for backpropagation is retained as well. How large is the increase per iteration?
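For illustration, here is a minimal sketch of the difference between accumulating the result with its graph attached and accumulating a detached copy. Whether .detach() is applicable depends on the use case; the assumption here is that gradients do not need to flow through the running sum (which is not stated in this thread).

import torch

d = [3 * torch.zeros(1, requires_grad=True).cuda() for _ in range(8)]

# Pattern 1: each in-place addition attaches the new product's graph to b,
# so b's autograd history grows by one set of nodes per iteration.
b = torch.zeros(5).cuda()
for _ in range(1000):
    prod = d[0] * d[1] * d[2] * d[3] * d[4] * d[5] * d[6] * d[7]
    b += prod              # graph (and its metadata) is retained

# Pattern 2 (assumption: gradients through the running sum are not needed,
# e.g. it is only accumulated for logging): detach before accumulating,
# so no graph is kept and host memory stays flat.
b_log = torch.zeros(5).cuda()
for _ in range(1000):
    prod = d[0] * d[1] * d[2] * d[3] * d[4] * d[5] * d[6] * d[7]
    b_log += prod.detach()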

Sorry for the late reply.
After 1,000,000 iterations, RAM usage increased by 12.2 GB.

Thanks for checking! I would assume this increase is expected, as it comes down to 12.2 * 1024**2 / 1000000 ≈ 12.79 kB per iteration, which might fit the metadata. CC @albanD to correct me in case ~12 kB per iteration is unexpected.
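For reference, one way to check the per-iteration growth directly is to track the process RSS, e.g. with psutil (a sketch; psutil is an assumption here and not necessarily how the number above was obtained):

import os
import psutil
import torch

proc = psutil.Process(os.getpid())

b = torch.zeros(5).cuda()
d = [3 * torch.zeros(1, requires_grad=True).cuda() for _ in range(8)]

n = 100000
rss_start = proc.memory_info().rss
for _ in range(n):
    b += d[0] * d[1] * d[2] * d[3] * d[4] * d[5] * d[6] * d[7]
rss_end = proc.memory_info().rss

print(f"{(rss_end - rss_start) / n / 1024:.2f} kB per iteration")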

Thank you for your reply. If my custom loss function requires a similar calculation (and ultimately needs backpropagation), is there a better way to write the program so that it uses less RAM?
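One pattern that avoids retaining the graphs across iterations is to compute the loss and call backward() inside each step (which frees that step's graph) and to accumulate only detached values for logging. A minimal sketch, assuming a hypothetical model, optimizer, and custom_loss, none of which come from this thread:

import torch

model = torch.nn.Linear(10, 5).cuda()             # hypothetical model
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)

def custom_loss(output):                          # hypothetical stand-in loss
    return (output ** 2).mean()

running_loss = 0.0
for step in range(1000):
    x = torch.randn(32, 10, device="cuda")        # hypothetical input batch
    optimizer.zero_grad()
    loss = custom_loss(model(x))
    loss.backward()                               # the graph for this step is freed here
    optimizer.step()
    running_loss += loss.item()                   # .item() returns a plain number, so no graph is kept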