I am currently implementing DQN for Tic Tac Toe and I am stuck with the following error:
element 0 of tensors does not require grad and does not have a grad_fn
I think the error occurs whenever the game is played to the end and reaches a terminal state: the target is then set manually to a constant value, so no gradient can be computed during the backward pass. How can I fix this?
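I can reproduce the same error in isolation when both inputs to the loss are plain tensors with no `grad_fn` (a minimal sketch, assuming an MSE loss; the variable names here are just for illustration):

```python
import torch

# Both tensors are constants: neither requires grad, so the resulting
# loss has no grad_fn and backward() cannot build a gradient.
pred = torch.tensor(5.0)    # plain tensor, requires_grad=False
target = torch.tensor(3.0)  # plain tensor, requires_grad=False

loss = torch.nn.functional.mse_loss(pred, target)

try:
    loss.backward()
except RuntimeError as e:
    # element 0 of tensors does not require grad and does not have a grad_fn
    print(e)
```

So the message seems to mean that nothing in the loss is connected to the computation graph.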
Here is my code for the updating of the NN:
def update_NN(state, next_state, action, player, discount, lr, loss_all):
    pred = torch.tensor(net(torch.tensor(state).float().view(-1, 9)).squeeze().detach().numpy()[action])
    reward = 0
    winner, game_status = check_result(next_state)
    if game_status == 'Done' and winner == player:
        reward = 100
    if game_status == 'Done' and winner != player:
        reward = -1
    if game_status == 'Draw':
        reward = 10
    if next_state.count(0) == 0:
        target = torch.tensor(reward).float()
    else:
        target = torch.tensor(reward).float() + discount * torch.max(net(torch.tensor(next_state).float()))
    # Evaluate loss
    loss = loss_fn(pred, target)
    print(loss)
    loss_all.append(loss)
    optimizer.zero_grad()
    # Backward pass
    loss.backward()
    # Update
    optimizer.step()