Hello K. Frank,
Thank you for your swift reply. I applied the changes you recommended, and I’m now faced with the following issue:
In the second iteration of the training loop, the following runtime error is produced:
RuntimeError: Trying to backward through the graph a second time, but the buffers have already been freed. Specify retain_graph=True when calling backward the first time.
If I pass retain_graph=True to the first backward() call and set it to False in the subsequent iterations, I then receive the following error message:
RuntimeError: one of the variables needed for gradient computation has been modified by an inplace operation: [torch.cuda.FloatTensor [200, 336]]...
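To pinpoint which operation this message refers to, PyTorch's anomaly detection can be enabled before calling backward; a minimal sketch (it slows training noticeably, so it is only worth switching on for debugging):

import torch

# makes the in-place error include a second traceback that points
# at the forward-pass operation whose output was modified in place
torch.autograd.set_detect_anomaly(True)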
I have determined that this tensor corresponds to the last layer of my fully connected network: its second dimension matches the width of my concatenated output, since 18·11 + 9·8 + 6·11 = 198 + 72 + 66 = 336. Below is the relevant code snippet from the training loop:
# update the gradients to zero
optimizer.zero_grad()

# forward pass
logits = model(x)

# one cross-entropy term per classification task
for i in range(18):
    loss_array[i] = criterion(logits[:, i*11:(i+1)*11], torch.argmax(labels[:, i*11:(i+1)*11], dim=1))
for i in range(9):
    loss_array[18+i] = criterion(logits[:, 198+i*8:198+(i+1)*8], torch.argmax(labels[:, 198+i*8:198+(i+1)*8], dim=1))
for i in range(6):
    loss_array[27+i] = criterion(logits[:, 270+i*11:270+(i+1)*11], torch.argmax(labels[:, 270+i*11:270+(i+1)*11], dim=1))

loss = torch.sum(loss_array)

# backward pass (retain the graph only on the very first batch)
loss.backward(retain_graph=first)
first = False

train_loss += loss.item()

# update the weights
optimizer.step()
The code is a bit more complicated than the example we discussed earlier: I had to use a tensor to hold all the individual losses, since I have 33 classification tasks instead of 4.
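For completeness, here is roughly how the pieces above are set up before the training loop (a minimal sketch; the model and optimizer definitions are omitted, and the criterion is assumed to be nn.CrossEntropyLoss):

import torch
import torch.nn as nn

device = torch.device("cuda")
criterion = nn.CrossEntropyLoss()

# one slot per classification head: 18 + 9 + 6 = 33 tasks
loss_array = torch.zeros(33, device=device)

# retain_graph flag: True only for the very first backward call
first = True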
I suspect that I am not using optimizer.step() correctly, as that is the only operation that updates the network's weights. Would you have any insight into what is going wrong in my code?