I am currently implementing DQN for Tic Tac Toe and am stuck on the following error:

element 0 of tensors does not require grad and does not have a grad_fn

I think the error occurs whenever the game is played to the end and reaches a terminal state: the target is then set manually to a constant, so the gradient can no longer be computed during the backward pass. How can I fix this?
Here is my code for the updating of the NN:
```python
def update_NN(state, next_state, action, player, discount, lr, loss_all):
    pred = torch.tensor(net(torch.tensor(state).float().view(-1, 9)).squeeze().detach().numpy()[action])

    reward = 0
    winner, game_status = check_result(next_state)
    if game_status == 'Done' and winner == player:
        reward = 100
    if game_status == 'Done' and winner != player:
        reward = -1
    if game_status == 'Draw':
        reward = 10

    if next_state.count(0) == 0:
        target = torch.tensor(reward).float()
    else:
        target = torch.tensor(reward).float() + discount * torch.max(net(torch.tensor(next_state).float()))

    # Evaluate loss
    loss = loss_fn(pred, target)
    print(loss)
    loss_all.append(loss)

    optimizer.zero_grad()
    # Backward pass
    loss.backward()
    # Update
    optimizer.step()
```
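For context, here is a minimal, self-contained sketch of the kind of change I believe would avoid the error (the `net`, `loss_fn`, and `optimizer` here are stand-ins, not my real definitions): `pred` is taken directly from the live network output instead of going through `.detach().numpy()`, so it keeps its `grad_fn`, and the bootstrap target is computed under `torch.no_grad()` so gradients only flow through the prediction:

```python
import torch
import torch.nn as nn

# Stand-in network for Tic Tac Toe: 9 board cells in, 9 Q-values out.
net = nn.Linear(9, 9)
loss_fn = nn.MSELoss()
optimizer = torch.optim.SGD(net.parameters(), lr=0.01)

def update_nn(state, next_state, action, reward, discount, terminal):
    """One DQN update step that keeps the gradient path to `pred` intact."""
    # Index the live network output directly -- no .detach()/.numpy()
    # round trip -- so `pred` still carries a grad_fn for backpropagation.
    pred = net(torch.tensor(state).float().view(-1, 9)).squeeze()[action]

    if terminal:
        # A constant target is fine: the gradient flows through `pred`.
        target = torch.tensor(float(reward))
    else:
        # Detach the bootstrap target so gradients do not flow through it.
        with torch.no_grad():
            target = reward + discount * torch.max(net(torch.tensor(next_state).float()))

    loss = loss_fn(pred, target)
    optimizer.zero_grad()
    loss.backward()   # works now: loss has a grad_fn via `pred`
    optimizer.step()
    return loss.item()

# A terminal-state update no longer raises the grad_fn error:
state = [1, -1, 1, -1, 1, 0, 0, 0, 0]
next_state = [1, -1, 1, -1, 1, 1, 0, 0, 0]
loss_value = update_nn(state, next_state, action=5, reward=100,
                       discount=0.9, terminal=True)
```

In this sketch the terminal branch is what triggered my error before: with `pred` attached to the graph, `loss.backward()` succeeds even though the target is a plain constant tensor.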