Hi,
I am trying to solve a classification problem using NNs, but I get the following error when calling backward():
Trying to backward through the graph a second time, but the buffers have already been freed
I can’t understand what’s wrong with my computation graph, since I am only passing the data through my model and then computing the loss.
def forward(self, x):
    output = torch.softmax(self.out.forward(x), dim=-1)
    return output

def train(self, some_params):
    optim.zero_grad()
    output = self(train_data)
    likelihood = output.gather(1, train_labels[:, n].detach().long().unsqueeze(1))
    loss = -torch.log(likelihood)
    loss = torch.mean(loss)
    loss.backward()
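For reference, the loss above (softmax, then gather of the true class, then negative log, then mean) should match what torch.nn.functional.cross_entropy computes directly from the raw logits. A minimal sketch, assuming random logits and integer class labels; the shapes and variable names are illustrative and not taken from the original code:

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)
logits = torch.randn(10, 9)            # hypothetical: 10 samples, 9 classes
labels = torch.randint(0, 9, (10,))    # hypothetical integer class labels

# Manual version, as in the post: softmax -> pick the true class -> -log -> mean
probs = torch.softmax(logits, dim=-1)
likelihood = probs.gather(1, labels.unsqueeze(1))
manual_loss = torch.mean(-torch.log(likelihood))

# Built-in version: takes logits (no softmax beforehand) and is numerically more stable
builtin_loss = F.cross_entropy(logits, labels)

same = torch.allclose(manual_loss, builtin_loss, atol=1e-5)
print(same)
```

If the built-in loss is used, forward would need to return the logits rather than the softmax output.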
I think you should not have forward inside forward. This would be a recursion.
self.out.forward(x) calls the forward method of an nn.Linear. I don’t think this affects the code.
If you wish to backward through the same graph a second time, pass retain_graph=True to the first backward() call. This preserves the intermediate buffers required to compute gradients.
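A minimal sketch of what retain_graph=True changes, using a toy tensor rather than the model from this thread (the names x, y, z are illustrative). In a training loop, needing this flag often points to a graph being shared across iterations by accident, so it is not necessarily the right fix here:

```python
import torch

x = torch.randn(3, requires_grad=True)
y = x.exp()          # exp saves its output for the backward pass

# First backward: retain_graph=True keeps the saved buffers alive
y.sum().backward(retain_graph=True)
first_grad = x.grad.clone()

# A second backward through the same graph now works; gradients accumulate in x.grad
y.sum().backward()
accumulated = torch.allclose(x.grad, 2 * first_grad)

# Without retain_graph, a second pass through the same graph raises a RuntimeError
z = x.exp()
z.sum().backward()
try:
    z.sum().backward()
    second_pass_failed = False
except RuntimeError:
    second_pass_failed = True

print(accumulated, second_pass_failed)
```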
Sounds interesting. I am still a novice. If out is a module, it should have a forward(), in which case you can call the module directly, like out().
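A small sketch of the difference between calling a module and calling its forward method, assuming a standalone nn.Linear (the layer and hook here are illustrative). Both return the same output, but only the direct call goes through the module's __call__ machinery, which runs registered hooks:

```python
import torch
import torch.nn as nn

layer = nn.Linear(4, 2)
calls = []

# Forward hooks run only when the module is invoked through __call__
handle = layer.register_forward_hook(lambda mod, inp, out: calls.append("hook"))

x = torch.randn(3, 4)
out_call = layer(x)              # runs forward plus any registered hooks
out_forward = layer.forward(x)   # runs forward only, hooks are skipped

handle.remove()
same_output = torch.equal(out_call, out_forward)
hook_count = len(calls)
print(same_output, hook_count)
```

Neither form changes how autograd builds the graph, so this is a style point rather than a cause of the error.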
I know that, but I am getting this error on the first run. And even if I went through the loop more than once (supposing I didn’t hit this initial error), things should still be fine, since I call backward only once per iteration of the for loop. Am I missing something?
Could you post a small executable code snippet so that we can debug it?
It looks like the posted methods are part of a class implementation and I’m not sure how you are using them.
It looks like this
import torch
import torch.nn as nn

class SudokuSolver(nn.Module):
    def __init__(self, in_size, hidden_size, out_size):
        super().__init__()
        self.in_size = in_size
        self.hidden_size = hidden_size
        self.out_size = out_size
        # self.hidden = nn.Linear(in_size, hidden_size)
        # for now I am using a single layer
        self.out = nn.Linear(hidden_size, out_size)

    def forward(self, x):
        output = torch.softmax(self.out.forward(x), dim=-1)
        return output

    def train(self, train_data, train_labels,
              verbose=100, epochs=100, lr=0.1, l2_weight=0,
              validation_data=None, validation_labels=None):
        optim = torch.optim.SGD(self.parameters(), lr=lr, weight_decay=l2_weight)
        train_loss = []
        validation_loss = []
        loss_fun = nn.MSELoss()
        print('start')
        for e in range(epochs):
            for n in range(self.in_size):
                optim.zero_grad()
                output = self(train_data)
                likelihood = output.gather(1, train_labels[:, n].detach().long().unsqueeze(1))
                loss = -torch.log(likelihood)
                loss = torch.mean(loss)
                # the error is raised here (e = 0)
                loss.backward()
                optim.step()
                # complete the sudoku
                j = torch.arange(train_data.detach().shape[0])
                train_data[j, n] = train_labels[j, n].detach()
            train_loss.append(loss.detach().numpy())
            if verbose != 0 and e % verbose == 0:
                print(loss.detach())
I am trying to solve a sudoku using only nn.Linears. My approach is to predict every digit of my sudoku. After one prediction I am adding the correct digit to my training data and then I am running it again till I fill the whole sudoku. It stops after the first loss.backward().
If I just guess some input parameters and shapes, your code seems to work:
model = SudokuSolver(5, 5, 2)
train_data = torch.randn(10, 5)
train_labels = torch.randint(0, 2, (10, 10)).float()
model.train(train_data, train_labels, verbose=1)
Could you check what’s different and correct some parameters?
I finally figured it out. I was setting requires_grad to True for my input tensor. I guess its empty grad caused this error, right?
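The thread does not show how the input tensor was created, so the following is only one plausible mechanism, not a confirmed diagnosis. If the input requires grad and is itself the result of an earlier operation (a non-leaf tensor), every iteration's graph reaches back into that shared history, and the second backward() finds its buffers already freed. A minimal sketch with a stand-in weight and made-up shapes; loop_succeeds, raw, and tracked are hypothetical names:

```python
import torch

def loop_succeeds(data, steps=2):
    """Return True if backward() succeeds on every iteration."""
    w = torch.randn(5, 1, requires_grad=True)   # stand-in for model parameters
    try:
        for _ in range(steps):
            loss = (data @ w).pow(2).mean()
            loss.backward()
        return True
    except RuntimeError:
        return False

raw = torch.randn(10, 5, requires_grad=True)
tracked = raw.exp()   # non-leaf input: carries its own graph back to raw

# Second iteration goes back through the already-freed exp node and fails
fails_with_tracked_input = not loop_succeeds(tracked)

# Detaching the input cuts it off from that history, so each iteration is independent
works_with_detached_input = loop_succeeds(tracked.detach())

print(fails_with_tracked_input, works_with_detached_input)
```

For plain training data there is usually no need for requires_grad=True on the input at all; only the model parameters need gradients.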