GPU memory keeps increasing

Hello,

I am trying to train an SSDA model.
The following is part of the training code.

        data = torch.cat((im_data_s, im_data_t), 0)
        target = torch.cat((gt_labels_s, gt_labels_t), 0)
        output = G(data)
        out1 = F1(output)
        loss = criterion(out1, target)
        loss.backward(retain_graph=True)
        optimizer_g.step()
        optimizer_f.step()
        zero_grad_all()
  
     
        output = G(im_data_tu)  # <== this forward pass leads to the GPU memory increase
        loss_t = adentropy(F1, output, 0.1)
        loss_t.backward()
        optimizer_f.step()
        optimizer_g.step()
       
        G.zero_grad()
        F1.zero_grad()
        zero_grad_all()

The result shows the memory usage rising:

Step:100
[Memory usage:2938.139136 MB]

Step:200
[Memory usage:3676.467712 MB]

Step:300
[Memory usage:4413.829632 MB]
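
(For context: figures in this form match torch.cuda.memory_allocated(), which returns bytes, divided by 1e6. A minimal logging sketch follows; the step counter and interval are assumptions, not the actual script.)

    import torch

    # Hypothetical logging sketch: report the GPU memory currently held by tensors, in MB.
    if step % 100 == 0:
        print('Step:{}'.format(step))
        print('[Memory usage:{} MB]'.format(torch.cuda.memory_allocated() / 1e6))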

I'm not sure what is happening. Is there any way to fix this problem?

Thanks!

Could you remove retain_graph=True, or is there a reason it is needed in your code?
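
For example, a minimal sketch of the same two updates with retain_graph=True dropped (this assumes nothing after the first backward call needs that graph again):

    output = G(data)
    out1 = F1(output)
    loss = criterion(out1, target)
    loss.backward()          # graph of the labeled batch is freed here
    optimizer_g.step()
    optimizer_f.step()
    zero_grad_all()

    output = G(im_data_tu)   # fresh graph for the unlabeled batch
    loss_t = adentropy(F1, output, 0.1)
    loss_t.backward()
    optimizer_f.step()
    optimizer_g.step()
    zero_grad_all()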

Indeed, I don't need to retain the graph in this case, but I'm still confused about why the memory keeps growing.

In each training step, loss.backward(retain_graph=True) retains the first graph, and the next lines

 output = G(im_data_tu)
 loss_t = adentropy(F1, output, 0.1)
 loss_t.backward()

create another graph, so holding both graphs at once should already be the maximum memory usage. Does the increasing memory come from the retained graph? When I comment out output = G(im_data_tu), the memory usage seems normal. Could you explain that? Thanks!
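
One way to probe this is to log both the memory that is currently allocated and the peak within each logging interval: if the peak (the cost of briefly holding two graphs in one step) stays roughly constant while the allocated amount keeps climbing, then something is keeping graphs or their saved tensors alive across steps rather than within a single step. A sketch that could go at the end of the training loop; the step variable and interval are assumptions:

    import torch

    # Hypothetical instrumentation inside the existing training loop.
    if step % 100 == 0:
        alloc = torch.cuda.memory_allocated() / 1e6     # memory held by live tensors
        peak = torch.cuda.max_memory_allocated() / 1e6  # peak since the last reset
        print('Step:{} alloc:{:.1f} MB peak:{:.1f} MB'.format(step, alloc, peak))
        torch.cuda.reset_peak_memory_stats()            # start a fresh peak window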