Reduce memory usage during iterative algorithm

Hello!

This is a snippet of my code (create_graph is required in my use case, and energy is a big CNN):

from torch.autograd import grad

# sample must have requires_grad=True so grad() can differentiate the energy w.r.t. it
for _ in range(n_steps):
    sample_energy = energy(sample)  # call the module directly instead of energy.forward()
    # score = d(energy)/d(sample); create_graph=True keeps this grad call in the graph
    # so the final loss can backpropagate through it
    sample_score = grad(sample_energy.sum(), sample, create_graph=True)[0]
    value, sample = f(value, sample, sample_energy, sample_score)

loss = g(value)
loss.backward()
optimizer.step()
optimizer.zero_grad()

What are my options to reduce the memory footprint of these model + grad calls?

f and g perform common, uninteresting Tensor operations like multiplication, summation, etc.
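For concreteness, a placeholder with the same flavour (not the real f and g, just something so the snippet can be run) could be:

def f(value, sample, sample_energy, sample_score):
    # placeholder: accumulate the energy and take a small step along the score
    return value + sample_energy.mean(), sample + 0.01 * sample_score

def g(value):
    # placeholder: reduce the accumulated value to a scalar loss
    return value.sum()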

Potentially, you could try gradient (activation) checkpointing, since you explicitly mention it's a large model. It recomputes activations during the backward pass instead of storing them, so you trade memory for extra compute; not sure how big a slowdown that will cause in your case.
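A minimal sketch of how activation checkpointing (torch.utils.checkpoint) could slot into your loop, assuming energy is a plain nn.Module. use_reentrant=False is what lets torch.autograd.grad see the checkpointed output; whether recomputation composes cleanly with the create_graph=True higher-order grad is something you'd want to verify on your model:

from torch.utils.checkpoint import checkpoint

for _ in range(n_steps):
    # recompute the CNN's activations during backward instead of storing them
    sample_energy = checkpoint(energy, sample, use_reentrant=False)
    sample_score = grad(sample_energy.sum(), sample, create_graph=True)[0]
    value, sample = f(value, sample, sample_energy, sample_score)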

Assuming your training follows this structure in general, also make sure you are not passing retain_graph=True anywhere: the default loss.backward(retain_graph=False) frees the graph as soon as the backward pass finishes.
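For completeness, a sketch of the end of the training step with that default spelled out. set_to_none=True is an optional extra memory saver that drops the .grad buffers between steps instead of zeroing them in place (it is already the default in recent PyTorch versions):

loss = g(value)
loss.backward()  # retain_graph defaults to False, so the graph is freed here
optimizer.step()
optimizer.zero_grad(set_to_none=True)  # release .grad tensors rather than filling them with zeros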