Since you said it stops after about 100 iterations, my best guess is that the autograd history attached to the loss is growing. See this example for the explanation:
output = model(input)
loss = myLossFunction(output, labels)
optimizer.zero_grad()
loss.backward()
optimizer.step()
# Here is the problem I am talking about.
# I think you are doing something similar to this:
running_loss += loss
When you add the loss this way, you are not detaching it from the computation graph first, so PyTorch keeps a reference to the entire graph that produced it. As you keep iterating, this accumulated history grows in size, which is why memory usage climbs until training stops.
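The fix is to accumulate only the scalar value. A minimal sketch, assuming running_loss is just an accumulator for logging and nothing is backpropagated through it:

running_loss += loss.item()    # .item() returns a plain Python float; no graph is kept

Or, if you want to keep it as a tensor:

running_loss += loss.detach()  # detach() breaks the link to the computation graph

Either way, the graph from each iteration can be freed after loss.backward(), and memory stays flat across iterations.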