Out of memory with retain_graph=True

I’m training an image feature learning model and pass retain_graph=True when calling loss.backward(). This works fine for the first 40 epochs or so, but then training fails with an out-of-memory error on the GPU. I’m running the program on 3 Titan X GPUs, and it only occupies 2000 MB at the beginning. A brief Google search suggests this might be caused by a memory leak related to retain_graph=True. How can I fix it?
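A minimal sketch of the setup described above, with per-epoch memory logging added to check whether allocated GPU memory really grows over time. The model, data shapes, and the `train` helper are hypothetical placeholders, not the actual training code; only the `loss.backward(retain_graph=True)` call is taken from the description.

```python
import torch
import torch.nn as nn


def train(num_epochs=3, steps_per_epoch=5, device="cpu"):
    """Toy loop (hypothetical stand-in for the real training code)."""
    torch.manual_seed(0)
    dev = torch.device(device)
    model = nn.Linear(16, 4).to(dev)  # placeholder model
    optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
    history = []

    for epoch in range(num_epochs):
        running_loss = 0.0
        for _ in range(steps_per_epoch):
            x = torch.randn(8, 16, device=dev)  # placeholder batch
            loss = model(x).pow(2).mean()

            optimizer.zero_grad()
            # Same call as in the description: the graph of this
            # iteration is kept alive for as long as `loss` (or anything
            # derived from it) is still referenced.
            loss.backward(retain_graph=True)
            optimizer.step()

            # .item() returns a plain Python float, so the running total
            # holds no reference to the retained graph.
            running_loss += loss.item()

        # Log allocated GPU memory once per epoch (0.0 when on CPU).
        if dev.type == "cuda":
            allocated_mb = torch.cuda.memory_allocated(dev) / 2**20
        else:
            allocated_mb = 0.0
        history.append((epoch, running_loss / steps_per_epoch, allocated_mb))

    return history


if __name__ == "__main__":
    for epoch, avg_loss, mb in train():
        print(f"epoch {epoch}: loss={avg_loss:.4f}, allocated={mb:.1f} MB")
```

If the logged value climbs steadily from epoch to epoch, something is probably holding references to tensors that are still attached to a retained graph. One common cause is accumulating `loss` itself across iterations instead of `loss.item()`.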
