I’m trying to add a sparsity constraint (as in a sparse autoencoder) to my model, using nn.KLDivLoss() on the average activation probability of the hidden units.
Here’s my code:
criterion1 = nn.MSELoss()
criterion2 = nn.KLDivLoss(size_average=False)
optimizer = torch.optim.Adam(model.parameters(), lr=learning_rate)

for epoch in range(num_epochs):
    for data in dataloader:
        img = Variable(data)
        # forward pass
        output = model(img)
        loss1 = criterion1(output, img)
        loss2 = criterion2(model.p_.log(), p) + criterion2((1 - model.p_).log(), 1 - p)
        loss = loss1 + loss2
        # backward pass
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
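For context, the two KLDivLoss terms together are meant to implement the usual sparse-autoencoder penalty KL(p || p_hat) for Bernoulli units, summed over the hidden units. Here is a minimal pure-Python sketch of the quantity I’m after (the values 0.05 and the p_hat list are just made-up examples, and sparsity_kl is a name I’m using here, not part of my model):

```python
import math

def sparsity_kl(p, p_hat):
    """KL(p || p_hat) between two Bernoulli distributions: the penalty a
    sparse autoencoder applies to a hidden unit whose mean activation
    p_hat deviates from the target sparsity p."""
    return (p * math.log(p / p_hat)
            + (1 - p) * math.log((1 - p) / (1 - p_hat)))

# Example: target sparsity p = 0.05, observed mean activations per unit
p_hats = [0.2, 0.1, 0.05]
penalty = sum(sparsity_kl(0.05, ph) for ph in p_hats)
```

The penalty is zero when a unit’s mean activation exactly matches the target and grows as it deviates; in my model, loss2 is intended to be this sum computed with autograd-friendly tensor ops.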
But I get this error at the second iteration:
RuntimeError: Trying to backward through the graph a second time, but the buffers have already been freed. Specify retain_graph=True when calling backward the first time.
If I use just “nn.MSELoss()” as the criterion, everything works fine.
Needless to say, if I set “retain_graph=True”, the model allocates more memory at every iteration and my GPU runs out of memory almost immediately.