Hi, I am trying to build an LSTM model. I am encountering this error:
File "spam_detection_LSTM_01.py", line 131, in
  loss.backward()
File "/gpfs/software/Anaconda/envs/pytorch-latest/lib/python3.8/site-packages/torch/_tensor.py", line 487, in backward
  torch.autograd.backward(
File "/gpfs/software/Anaconda/envs/pytorch-latest/lib/python3.8/site-packages/torch/autograd/__init__.py", line 197, in backward
  Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass
RuntimeError: Trying to backward through the graph a second time (or directly access saved tensors after they have already been freed). Saved intermediate values of the graph are freed when you call .backward() or autograd.grad(). Specify retain_graph=True if you need to backward through the graph a second time or if you need to access saved tensors after calling backward.
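I do understand what the message itself means; a toy example like the one below (not my real code) reproduces it, because the second backward() call walks a graph whose saved tensors were already freed:

import torch

x = torch.randn(3, requires_grad=True)
y = (x * 2).sum()
y.backward()   # first backward pass frees the saved intermediate values
y.backward()   # second backward pass over the same graph raises the RuntimeError above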
I have read several threads here with possible fixes and I have tried all of them already (hopefully correctly). However, the error still persists. My model class currently looks like this:
import torch
import torch.nn as nn
import torch.nn.functional as F
from torch.nn import LSTM, Linear

class Network(nn.Module):
    def __init__(self, input_size=40, hidden_size=10, num_layers=1, batch_first=True):
        super(Network, self).__init__()
        self.hidden_size = hidden_size
        # our LSTM model
        self.LSTM = LSTM(input_size=input_size, hidden_size=hidden_size,
                         num_layers=num_layers, batch_first=batch_first)
        # final fully connected layers
        self.fc1 = Linear(10, 5)
        self.fc2 = Linear(5, 1)

    def forward(self, x):
        # fresh zero states on every call; the leading 1 assumes num_layers=1
        h0 = torch.zeros(1, x.size(0), self.hidden_size)
        c0 = torch.zeros(1, x.size(0), self.hidden_size)
        output, hidden = self.LSTM(x, (h0.detach(), c0.detach()))
        # take the last time step, then pass it through the fully connected layers
        x = F.relu(self.fc1(output[:, -1, :].view(x.size(0), -1)))
        x = torch.sigmoid(self.fc2(x))
        return x.flatten(), hidden
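From what I understand from those threads, the usual way this error shows up with an LSTM is when the hidden state returned for one batch is fed into the next forward pass without being detached, as in this toy loop (again just an illustration, not my code):

import torch
from torch import nn

lstm = nn.LSTM(input_size=40, hidden_size=10, batch_first=True)
hidden = None
for step in range(2):
    inp = torch.randn(4, 8, 40)        # (batch, seq_len, features)
    out, hidden = lstm(inp, hidden)    # hidden from the previous batch still points at the old graph
    loss = out.sum()
    loss.backward()                    # fails on the second iteration with the error above

That is why the .detach() calls are in forward(), even though h0 and c0 are created fresh on every call anyway.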
I have tried many variations (for instance, leaving out the “.detach()” calls), but I still get the same error. My training loop looks like this:
start = time.time()

# training
for epoch in range(epochs):
    running_loss = 0.0
    # hidden = model.init_hidden(batch_size)
    for i, data in enumerate(train_loader, 0):
        current = time.time()
        print(i, (current - start) / 60)   # minutes since the start of training
        inputs, labels = data
        outputs, hidden = model(inputs)
        loss = criterion(outputs, labels.float())
        # running_loss += loss.item()
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()

print('Finished Training')
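For completeness, the model, loss and optimizer are created earlier in the script. I have not pasted those lines, so the sketch below only shows the shape of that setup; treat the loss and optimizer choices as placeholders rather than my exact ones:

model = Network()
criterion = nn.BCELoss()                                    # placeholder loss on the sigmoid output
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)   # placeholder optimizer and learning rate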
Does anyone know what’s wrong with my code? Where am I “trying to backward through the graph a second time”? Thanks in advance!