Hi, I’m trying to implement a simple reinforcement learning model with training, but on the line “loss.backward();” I get “leaf variable has been moved into the graph interior”. Can anyone help?
(I have looked at other posts here about this error, but I haven’t been able to spot the issue.)
struct CNetwork : torch::nn::Module
{
    CNetwork();

    torch::nn::Linear fc1{nullptr};
    torch::nn::Linear fc2{nullptr};
    torch::nn::Linear fc3{nullptr};
    torch::optim::Adam* optimiser = nullptr;
};

CNetwork::CNetwork()
{
    fc1 = register_module("fc1", torch::nn::Linear(n_features, 2048));
    fc2 = register_module("fc2", torch::nn::Linear(2048, 2048));
    fc3 = register_module("fc3", torch::nn::Linear(2048, n_outputs));
    // assign to the member rather than shadowing it with a local that dies at the end of the constructor
    optimiser = new torch::optim::Adam(this->parameters(), torch::optim::AdamOptions(2e-4).beta1(0.5));
}
void CCC::learning_step()
{
    // get the model's prediction of what will happen
    torch::Tensor current_state = get_current_state_tensor(true);
    torch::Tensor prediction = network.forward(current_state);
    auto prediction_accessor = prediction.accessor<float, 1>();

    // use a policy to find the next action to try
    int action_id = choose_next_action_id_to_try();
    Cascai_Action* action = action_list[action_id];
    action->activate_action();

    // n game ticks
    WorldState* ws = hog->world_state;
    for (int i = 0; i < N_TICKS_PER_AI_TICK; i++)
    {
        ws->tick();
    }

    // measure our success metric
    F32 success_at_this_state = learning_success_function(ws, hog);

    // tick the AI so the new features are loaded
    hog->tick_ai(AI_TICK_MS);
    torch::Tensor new_state = get_current_state_tensor(false);
    torch::Tensor prediction_from_new_state = network.forward(new_state);
    auto prediction_from_new_state_accessor = prediction_from_new_state.accessor<float, 1>();
    torch::Tensor argmax_tensor = prediction_from_new_state.argmax();
    int argmax = argmax_tensor.item<long>();
    F32 max = prediction_from_new_state_accessor[argmax];
    F32 desired_q_score = success_at_this_state + max;

    torch::Tensor desired_prediction = torch::empty({get_n_actions()}, torch::requires_grad(false));
    auto desired_prediction_accessor = desired_prediction.accessor<float, 1>();
    for (int i = 0; i < get_n_actions(); i++)
    {
        desired_prediction_accessor[i] = prediction_accessor[i];
    }
    desired_prediction_accessor[action_id] = desired_q_score;

    torch::Tensor loss = torch::binary_cross_entropy_with_logits(prediction, desired_prediction.detach());
    network.optimiser->zero_grad();  // clear gradients left over from the previous step
    loss.backward();
    network.optimiser->step();
}
This happens because one of your parameters (a leaf Tensor that requires grad) has been modified in place, or rather a view of that Tensor has been modified in place.
I am not sure exactly where you do that in your code, but you will have to remove this in-place op.
If you run from Python, you can enable anomaly mode to get more information about where the error comes from.
This is only a problem if the original Tensor requires gradients when you modify it and you do so in a differentiable way.
In general, you can use `torch::NoGradGuard guard;` to disable gradients in the local scope. You can use that to prevent any issue while you initialize a Tensor.
That being said, in the code above it seems like your original Tensor does not require gradients, so that might not be the place?