How to prevent this memory leak?

jerinphilip (Jerin Philip) June 11, 2019, 5:15am 4

See if one of the following threads suits your case.

Accumulating Gradients

I want to accumulate the gradients before I do a backward pass. So wondering what the right way of doing it is. According to this article it’s (let’s assume equal batch sizes): model.zero_grad() # Reset gradients tensors for i, (inputs, labels) in enumerate(training_set): predictions = model(inputs) # Forward pass loss = loss_function(predictions, labels) # Compute loss function loss = loss / accumulation_steps …

Implementing Truncated Backpropagation Through Time autograd

Hello, I’m implementing a recursive network that is going to be trained with very long sequences. I had memory problems when training because of that excessive length and I decided to use a truncated-BPTT algorithm to train it as described here, that is, every k1 steps backpropagate taking k2 back steps checking some examples I could easily write the case when k1 = k2. However, I haven’t been able to implement the general case yet. My first idea was to freeze the gradient graph after the f…

show post in topic

Home
Categories
Guidelines
Terms of Service
Privacy Policy

Powered by Discourse, best viewed with JavaScript enabled