I know that if I run the code below, memory usage will increase:
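    loss = 0
    for i in range(n):
        # per-sequence loss (arguments omitted here)
        loss_part = criteria()
        # summing keeps every loss_part's graph alive until backward()
        loss += loss_part
    loss.backward()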
But why does it increase so fast, and why is the allocated memory never released?
My situation is that the data in a batch have different dimensions (sequences of different lengths).
So I want to compute criteria(seq_i) for every sequence
and then sum the results.
Another approach is to concatenate all the seq_i and call criteria(seq_sum, target)
only once. Unfortunately, that did not fix the growing memory either. Thanks in advance to everyone.
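To make the two approaches concrete, here is a minimal sketch that runs on the CPU. It is not my actual code: the tiny nn.Linear model, the feature size, the lengths, and the use of CrossEntropyLoss with reduction="sum" are all assumptions chosen only so the example runs.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Hypothetical stand-ins: a tiny per-step classifier and three
# sequences of different lengths (none of this is from the real model).
model = nn.Linear(8, 5)
criteria = nn.CrossEntropyLoss(reduction="sum")
lengths = [3, 7, 4]
seqs = [torch.randn(n, 8) for n in lengths]
targets = [torch.randint(0, 5, (n,)) for n in lengths]

# Approach 1: one criteria call per sequence, summed into a single loss.
loss_sum = sum(criteria(model(s), t) for s, t in zip(seqs, targets))

# Approach 2: concatenate all sequences and call criteria once.
loss_cat = criteria(model(torch.cat(seqs)), torch.cat(targets))

# With reduction="sum" the two values agree up to floating-point error.
same = torch.allclose(loss_sum, loss_cat, atol=1e-4)

# A single backward pass on either loss produces the gradients.
model.zero_grad()
loss_cat.backward()
```

With a mean reduction the two values would generally differ, because each sequence would be averaged over its own length before summing.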
P.S. I also use a list to store some data (tensors moved to the GPU with Tensor.cuda()).
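One thing that may be worth checking with such a list, as a guess rather than a confirmed cause: a tensor that is still attached to the autograd graph keeps that graph alive for as long as the list holds it. The sketch below runs on the CPU and uses hypothetical names (w, kept_with_graph, kept_detached) to show the difference between storing the tensor itself and storing a detached copy.

```python
import torch

torch.manual_seed(0)

# Hypothetical parameter that requires gradients.
w = torch.randn(4, 4, requires_grad=True)

kept_with_graph = []  # each entry still references its autograd graph
kept_detached = []    # each entry holds only the value

for _ in range(3):
    out = (torch.randn(2, 4) @ w).sum()
    kept_with_graph.append(out)         # grad_fn keeps the graph reachable
    kept_detached.append(out.detach())  # same value, no graph attached
    # out.item() would store a plain Python float instead of a tensor

has_graph = [t.grad_fn is not None for t in kept_with_graph]
no_graph = [t.grad_fn is None for t in kept_detached]
```

If the stored values are only needed for logging or later inspection, the detached form (or .item() for scalars) avoids holding on to the graph.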
Update: It seems the increase is not caused by accumulating the loss. I need help closing this topic.