GPU memory consumption in loss.backward() and optimizer.step()

I have a GPU with 12 GB of memory. Memory consumption is around 4.5 GB before the loss.backward() step, jumps to 9 GB after it, and the run goes out of memory at optimizer.step(). Any idea why this might be happening? Also, is there a way to check which variables are consuming memory?
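For the second part of the question, one common way to inspect which tensors are alive is to walk Python's garbage collector and report every tensor it tracks. This is not an official PyTorch API for memory profiling, just a sketch; `live_tensor_report` is a hypothetical helper name, and on a CUDA box you could additionally call `torch.cuda.memory_allocated()` for the total bytes held by tensors.

```python
import gc
import torch

def live_tensor_report():
    """List live tensors tracked by Python's GC as (shape, device, MB)."""
    report = []
    for obj in gc.get_objects():
        try:
            if torch.is_tensor(obj):
                size_mb = obj.element_size() * obj.nelement() / 1024**2
                report.append((tuple(obj.shape), str(obj.device), size_mb))
        except Exception:
            continue  # some GC-tracked objects raise on attribute access
    # Largest tensors first, so the biggest consumers are at the top
    return sorted(report, key=lambda r: r[2], reverse=True)
```

Calling this right before and after loss.backward() should show the gradient tensors appearing, which typically accounts for the jump from 4.5 GB to 9 GB (each parameter gets a same-sized .grad tensor).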


One more follow-up question: is it possible to do some of the updates in optimizer.step() on the GPU and some on the CPU, depending on the memory available on the GPU?
