GPU memory increase in multi-task learning

Hello. I have some questions about GPU memory usage during multi-task training. Suppose I have two tasks that share the shallow conv layers and then split into two branches, each producing its own loss. When the two losses are added and backwarded together, i.e. loss = loss1 + loss2; loss.backward(), GPU memory usage is about 7971 MB. But when only one of the two losses is backwarded, i.e. loss = loss1; loss.backward(), the usage increases to 88xx MB.
I can't figure out exactly why the memory usage increases so much. My guess is that the computation graph of the loss2 branch is never released, but I still don't know precisely what the extra memory contains. Please help me, and thank you very much!
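To make the setup concrete, here is a minimal sketch of the kind of structure I mean (the module and tensor names are placeholders for illustration, not my actual project code):

```python
import torch
import torch.nn as nn

class TwoTaskNet(nn.Module):
    def __init__(self):
        super().__init__()
        # shallow conv layers shared by both tasks
        self.shared = nn.Sequential(
            nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(),
        )
        # two separate branches, one per task
        self.head1 = nn.Conv2d(64, 10, 1)
        self.head2 = nn.Conv2d(64, 10, 1)

    def forward(self, x):
        feat = self.shared(x)
        return self.head1(feat), self.head2(feat)

model = TwoTaskNet().cuda()
criterion = nn.MSELoss()
x = torch.randn(8, 3, 64, 64, device="cuda")
t1 = torch.randn(8, 10, 64, 64, device="cuda")
t2 = torch.randn(8, 10, 64, 64, device="cuda")

out1, out2 = model(x)
loss1, loss2 = criterion(out1, t1), criterion(out2, t2)

# variant A: backward through both branches -> ~7971 MB in my case
loss = loss1 + loss2
loss.backward()

# variant B: backward through only one branch -> ~88xx MB in my case
# loss = loss1
# loss.backward()
```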

Could you show your code?

Thanks for replying! It's a big project and the code is rather complicated, so could you tell me which part you'd like to check?

PS: If I use the form loss = loss1 + 0*loss2, the memory usage is normal (7971 MB). Presumably this is because 0*loss2 still backpropagates through the loss2 graph (with zero gradients) and frees its intermediate buffers, which would be consistent with the unreleased-graph guess above.
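For reference, here is how the peak memory of the three variants could be compared, reusing the placeholder model and tensors from the sketch above (again just a sketch, not my real training loop):

```python
# Compare peak GPU memory for the three loss variants.
def peak_mb(make_loss):
    torch.cuda.empty_cache()
    torch.cuda.reset_peak_memory_stats()
    o1, o2 = model(x)
    make_loss(criterion(o1, t1), criterion(o2, t2)).backward()
    model.zero_grad(set_to_none=True)
    return torch.cuda.max_memory_allocated() / 2**20  # MB

print(peak_mb(lambda l1, l2: l1 + l2))      # both losses backwarded
print(peak_mb(lambda l1, l2: l1))           # only loss1 backwarded
print(peak_mb(lambda l1, l2: l1 + 0 * l2))  # zero-weighted loss2
```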