How to allocate large intermediate variables in advance?

I have a large intermediate variable, so I allocated its memory once, before the optimization loop. However, a RuntimeError is raised in the second epoch. When I move the allocation inside the loop, the error disappears, but then a large block of memory is allocated every epoch, which I expect to be slow. Could anyone explain why the RuntimeError happens and how I can avoid it? Can I save time by allocating a large intermediate variable in advance?

import torch

# `loss` and `data` are defined elsewhere (not shown)
var1 = torch.rand(128, 128, requires_grad=True)
m1 = var1.new_empty((4, *var1.shape))  # intermediate variable, allocated once
a = torch.linspace(1, 10, 4).reshape(4, 1, 1)

for i in range(10):
    # m1 = var1.new_empty((4, *var1.shape))  # no error if this line is uncommented
    m1[:] = torch.exp(var1[None, :] ** 2)
    l = loss(m1, data)
    l.backward()  # raises the RuntimeError below in the second iteration
    with torch.no_grad():
        var1 -= var1.grad * 0.1

# RuntimeError: Trying to backward through the graph a second time, but the saved intermediate results have already been freed. Specify retain_graph=True when calling .backward() or autograd.grad() the first time.
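
A likely explanation is that the in-place write `m1[:] = ...` makes `m1` part of the autograd graph. In the second epoch the new write is then chained onto the first epoch's graph, which `backward()` has already freed. Below is a minimal sketch of one possible workaround, assuming it is acceptable to detach the buffer at the top of each iteration. `loss` and `data` are hypothetical stand-ins (a mean squared error against zeros), because the real definitions are not shown, and the `var1.grad.zero_()` call is also an addition that is not in the code above.

```python
import torch

torch.manual_seed(0)

var1 = torch.rand(128, 128, requires_grad=True)
m1 = var1.new_empty((4, *var1.shape))  # buffer allocated once, outside the loop

# Hypothetical stand-ins: the original `data` and `loss` are not shown.
data = torch.zeros(4, 128, 128)

def loss(pred, target):
    return ((pred - target) ** 2).mean()

losses = []
for i in range(10):
    # Drop the autograd history left on the buffer by the previous iteration.
    # detach() returns a tensor sharing the same storage, so nothing is reallocated.
    m1 = m1.detach()
    m1[:] = torch.exp(var1[None, :] ** 2)  # in-place write attaches m1 to this iteration's graph
    l = loss(m1, data)
    l.backward()
    with torch.no_grad():
        var1 -= var1.grad * 0.1
        var1.grad.zero_()  # added here so gradients do not accumulate across iterations
    losses.append(l.item())
```

Note that `torch.exp(...)` still creates a fresh tensor on the right-hand side before it is copied into `m1`, so the preallocated buffer probably saves little, and writing `m1 = torch.exp(var1[None, :] ** 2)` directly may be just as fast. It is worth timing both variants before keeping the buffer.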