I’m trying to add a sparsity constraint (as in a sparse autoencoder) to my model, using nn.KLDivLoss() on the average activation probability of the hidden units.
Here’s my code:
criterion1 = nn.MSELoss()
criterion2 = nn.KLDivLoss(size_average=False)
optimizer = torch.optim.Adam(model.parameters(), lr=learning_rate)

for epoch in range(num_epochs):
    for data in dataloader:
        img = Variable(data)
        # forward pass
        output = model(img)
        loss1 = criterion1(output, img)
        loss2 = criterion2(model.p_.log(), p) + criterion2((1 - model.p_).log(), 1 - p)
        loss = loss1 + loss2
        # backward pass
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
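For context, the two KLDivLoss terms together are meant to implement the usual sparse-autoencoder penalty KL(p || p_hat) for Bernoulli units, summed over the hidden units. Here is a minimal pure-Python sketch of the quantity I’m after (the values 0.05 and the p_hat list are just made-up examples, and sparsity_kl is a name I’m using here, not part of my model):

```python
import math

def sparsity_kl(p, p_hat):
    """KL(p || p_hat) between two Bernoulli distributions: the penalty a
    sparse autoencoder applies to a hidden unit whose mean activation
    p_hat deviates from the target sparsity p."""
    return (p * math.log(p / p_hat)
            + (1 - p) * math.log((1 - p) / (1 - p_hat)))

# Example: target sparsity p = 0.05, observed mean activations per unit
p_hats = [0.2, 0.1, 0.05]
penalty = sum(sparsity_kl(0.05, ph) for ph in p_hats)
```

The penalty is zero when a unit’s mean activation exactly matches the target and grows as it deviates; in my model, loss2 is intended to be this sum computed with autograd-friendly tensor ops.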
But I get this error at the second iteration:
RuntimeError: Trying to backward through the graph a second time, but the buffers have already been freed. Specify retain_graph=True when calling backward the first time.
If I use just “nn.MSELoss()” as the criterion, everything works fine.
Needless to say, if I set “retain_graph=True”, the model allocates more memory at every iteration and my GPU runs out of memory almost immediately.