I’m trying to accumulate into a variable using a loop, as follows. In each iteration of the loop, I do some computation using the variables and add the result to `total_act`. Note that the actual operation I’m trying to do is more complicated; this is just a minimal example to reproduce the problem.

```
import torch
import torch.nn as nn
from torch.nn import Parameter

num_x = 20000
num_y = 1000
emb_dim = 500

class Model(torch.nn.Module):
    def __init__(self):
        super(Model, self).__init__()
        self.x_embed = Parameter(torch.FloatTensor(num_x, emb_dim))
        self.y_embed = Parameter(torch.FloatTensor(num_y, emb_dim))
        self.w = Parameter(torch.FloatTensor(emb_dim, emb_dim))

    def forward(self):
        total_act = 0
        for i in range(num_y):
            import pdb; pdb.set_trace()
            trans = self.x_embed - self.y_embed[i]
            trans = torch.mm(trans, self.w)  # If we remove this line, the problem doesn't occur
            total_act += trans
        return total_act

model = Model()
model = model.cuda()
final_act = model.forward()
```

What I notice is that with each iteration of the for loop, the memory occupied on the GPU keeps increasing. I’ve put a `pdb` breakpoint in the loop so that one can track memory consumption at each iteration.

My guess is that this is due to the intermediary states of the various variables occupying space, as though it’s building one graph for every single iteration of the loop. How do I solve this problem?

Hello,

From my limited understanding, you could use a temporary buffer to perform the calculation and release it after accumulating. But I think that if the variables require gradients, the intermediate computation graph has to be kept for each iteration, so the memory growth is unavoidable.
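As a rough way to see this on a small CPU example, the sketch below counts how many tensors autograd saves for the backward pass while the loop runs. It assumes a PyTorch version that provides `torch.autograd.graph.saved_tensors_hooks`; the helper `count_saved_tensors` and the tiny shapes are made up for illustration and are not part of the original code.

```python
import torch

def count_saved_tensors(num_iters, track_grad=True):
    """Run a tiny CPU version of the loop and count tensors saved for backward.

    Hypothetical helper for illustration; shapes are deliberately small.
    """
    torch.manual_seed(0)
    x_embed = torch.randn(8, 4, requires_grad=True)
    y_embed = torch.randn(num_iters, 4, requires_grad=True)
    w = torch.randn(4, 4, requires_grad=True)

    saved = []  # one entry per tensor autograd stashes for the backward pass

    def pack(t):
        # Called whenever an operation saves a tensor for its backward.
        saved.append(tuple(t.shape))
        return t

    def unpack(t):
        return t

    grad_mode = torch.enable_grad() if track_grad else torch.no_grad()
    with grad_mode, torch.autograd.graph.saved_tensors_hooks(pack, unpack):
        total_act = torch.zeros(8, 4)
        for i in range(num_iters):
            trans = torch.mm(x_embed - y_embed[i], w)
            total_act = total_act + trans
    return len(saved)

if __name__ == "__main__":
    for n in (1, 2, 4, 8):
        print(n, "iterations ->", count_saved_tensors(n), "saved tensors")
    print("no_grad, 8 iterations ->", count_saved_tensors(8, track_grad=False))
```

The count should grow with the number of iterations when gradients are tracked, and stay at zero under `torch.no_grad()`. That suggests a temporary buffer only helps when no gradient is needed for the accumulated result.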

If you find a better method, please let me know. Thank you.

Hi,

Thank you for your reply!

I didn’t quite get what you meant by “use a temporary buffer to perform calculation”. Could you provide an example?

Also, I managed to find a temporary workaround by defining a custom autograd `Function` and writing its backward pass manually (code follows). This is far from satisfactory, so I’m still on the lookout for solutions.

```
import torch
import torch.nn as nn
from torch.nn import Parameter

num_x = 20000
num_y = 1000
emb_dim = 500

class MyFunc(torch.autograd.Function):
    @staticmethod
    def forward(ctx, X, Y, W):
        ctx.save_for_backward(X, Y, W)
        total_act = 0
        num_y = Y.shape[0]
        for i in range(num_y):
            trans = X - Y[i].expand_as(X)
            trans = torch.mm(trans, W)
            total_act += trans
        return total_act

    @staticmethod
    def backward(ctx, grad_output):
        X, Y, W = ctx.saved_tensors
        num_y = Y.shape[0]
        sum_Y = torch.sum(Y, dim=0)
        grad_X = num_y * torch.mm(grad_output, torch.t(W))
        grad_Y = -torch.sum(torch.mm(grad_output, torch.t(W)), dim=0).repeat(num_y, 1)
        grad_W = torch.mm(torch.t((num_y * X) - sum_Y.expand_as(X)), grad_output)
        return (grad_X, grad_Y, grad_W)

class Model(torch.nn.Module):
    def __init__(self):
        super(Model, self).__init__()
        self.x_embed = Parameter(torch.FloatTensor(num_x, emb_dim))
        self.y_embed = Parameter(torch.FloatTensor(num_y, emb_dim))
        self.w = Parameter(torch.FloatTensor(emb_dim, emb_dim))
        self.custom_op = MyFunc.apply

    def custom_forward(self):
        return self.custom_op(self.x_embed, self.y_embed, self.w)

model = Model()
model = model.cuda()
final_act = model.custom_forward()
```
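For this minimal example specifically, the manual backward relies on the identity `sum_i (X - Y[i]) @ W == (num_y * X - sum_i Y[i]) @ W`, so the loop itself can be collapsed into a single matrix product that plain autograd differentiates with one saved intermediate. The sketch below checks that on small CPU tensors. The function names and shapes are illustrative assumptions, and the trick may not carry over to the more complicated real operation.

```python
import torch

def loop_forward(X, Y, W):
    # Same accumulation as the loop above, written out of place.
    total = torch.zeros(X.shape[0], W.shape[1], dtype=X.dtype)
    for i in range(Y.shape[0]):
        total = total + torch.mm(X - Y[i], W)
    return total

def closed_form_forward(X, Y, W):
    # sum_i (X - Y[i]) @ W == (num_y * X - sum_i Y[i]) @ W
    num_y = Y.shape[0]
    return torch.mm(num_y * X - Y.sum(dim=0), W)

def check(num_x=6, num_y=5, emb_dim=3):
    """Compare outputs and gradients of both versions (illustrative sizes)."""
    torch.manual_seed(0)

    def make(*shape):
        return torch.randn(*shape, dtype=torch.double, requires_grad=True)

    X, Y, W = make(num_x, emb_dim), make(num_y, emb_dim), make(emb_dim, emb_dim)
    out_loop = loop_forward(X, Y, W)
    out_closed = closed_form_forward(X, Y, W)

    # Backpropagate the same upstream gradient through both graphs.
    grad_out = torch.randn_like(out_loop)
    grads_loop = torch.autograd.grad(out_loop, (X, Y, W), grad_out)
    grads_closed = torch.autograd.grad(out_closed, (X, Y, W), grad_out)

    same_out = torch.allclose(out_loop, out_closed)
    same_grads = all(torch.allclose(a, b) for a, b in zip(grads_loop, grads_closed))
    return same_out, same_grads

if __name__ == "__main__":
    print(check())
```

If both flags come back true, the closed form matches the loop in value and gradient, without building one graph node per iteration.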
