Leaf variable has been moved into the graph interior

Hi, I’m trying to implement a simple Reinforcement Learning model with training, but on the line “loss.backward();” - I get “leaf variable has been moved into the graph interior”. Can anyone help?

(I have looked at other posts here about this error but I haven’t been able to spot the issue.)

struct CNetwork : torch::nn::Module
{
    CNetwork();

    torch::nn::Linear fc1{nullptr};
    torch::nn::Linear fc2{nullptr};
    torch::nn::Linear fc3{nullptr};
    torch::optim::Adam* optimiser;
};

CNetwork::CNetwork()
{
    fc1 = register_module("fc1", torch::nn::Linear(n_features, 2048));
    fc2 = register_module("fc2", torch::nn::Linear(2048, 2048));
    fc3 = register_module("fc3", torch::nn::Linear(2048, n_outputs));
    optimiser = new torch::optim::Adam(this->parameters(), torch::optim::AdamOptions(2e-4).beta1(0.5));
}

void CCC::learning_step()
{
// get the model's prediction of what will happen

    torch::Tensor current_state = get_current_state_tensor(true);

    torch::Tensor prediction = network.forward(current_state);
    auto prediction_accessor = prediction.accessor<float,1>();

    // use a policy to find the next action to try

    int action_id = choose_next_action_id_to_try();

    Cascai_Action* action = action_list[action_id];

    action->activate_action();

    // n game ticks

    WorldState* ws = hog->world_state;

    for (int i = 0; i < N_TICKS_PER_AI_TICK; i ++)
    {
        ws->tick();
    }

    // measure our success metric

    F32 success_at_this_state = learning_success_function(ws, hog);

    // tick the AI so the new features are loaded

    hog->tick_ai(AI_TICK_MS);

    torch::Tensor new_state = get_current_state_tensor(false);

    torch::Tensor prediction_from_new_state = network.forward(new_state);

    auto prediction_from_new_state_accessor = prediction_from_new_state.accessor<float,1>();

    torch::Tensor argmax_tensor = prediction_from_new_state.argmax();
    int argmax = *argmax_tensor.data_ptr<long>();

    F32 max = prediction_from_new_state_accessor[argmax];

    F32 desired_q_score = success_at_this_state + max;

    torch::Tensor desired_prediction = torch::empty({get_n_actions()}, torch::requires_grad(false));
    auto desired_prediction_accessor = desired_prediction.accessor<float,1>();

    for (int i = 0; i < get_n_actions(); i ++)
    {
        desired_prediction_accessor[i] = prediction_accessor[i];
    }
    
    desired_prediction_accessor[action_id] = desired_q_score;

    torch::Tensor loss = torch::binary_cross_entropy_with_logits(prediction, desired_prediction.detach());
    loss.backward();

    network.optimiser->step();
}

Hi,

This happens because one of your leaf Tensors that requires grad (for example a parameter), or a view of it, has been modified in place.

I am not sure exactly where you do that in your code, but you will have to remove this in-place op.
If you run from python, you can enable the anomaly mode to get more information about where the error comes from.

Thanks: I’m sure this is where I do it.

How exactly do I set tensor values without modifying in place? Compose the data as an array and then initialise a new Tensor with that?

Ian

This is only a problem if the original Tensor requires gradients when you modify it and you do so in a differentiable way.
In general, you can declare a torch::NoGradGuard guard; to disable gradients in the local scope. You can use that to prevent any issue while you initialise a Tensor.

That being said, in the code above, it seems like your original Tensor does not require gradients so it might not be the place?

That’s right… I still get the error though. Maybe it’s in the first var:

int n_features = feature_list.size();

torch::Tensor state_tensor = torch::ones({n_features}, torch::requires_grad(requires_grad));

for (int i = 0; i < n_features; i ++)
{
    state_tensor[i] = feature_list[i].value;
}

return state_tensor;

Any chance you could be a bit clearer about the normal way to initialise a tensor with existing data though?

Ian

Would this be the correct way of using the guard?

int n_features = feature_list.size();

torch::Tensor state_tensor = torch::ones({n_features}, torch::requires_grad(requires_grad));

{
    torch::NoGradGuard guard;
    for (int i = 0; i < n_features; i ++)
    {
        state_tensor[i] = feature_list[i].value;
    }
}

return state_tensor;

Yes, that looks like a suspicious place.
And yes, this is the right way to use the guard. It is the same as using with torch.no_grad(): in Python.

Does this fix your issue?

It does fix my issue, thank you for your time.

I have a quick follow up: I wish to store my Optimiser within a struct, like this:

struct Cascai_Network : torch::nn::Module 
{
    Cascai_Network() {}

    Cascai_Network(int n_features, int n_outputs);

    torch::Tensor forward(torch::Tensor x);

    //

    torch::nn::Linear fc1{nullptr};
    torch::nn::Linear fc2{nullptr};
    torch::nn::Linear fc3{nullptr};

    torch::optim::Adam optimiser;
};

But I get a compile error in the default constructor which states

“‘torch::optim::Adam::Adam()’ is private within this context”

Have you any advice on how to structure my program to solve this?

Thank you.

I guess that you are not allowed to default construct the optimizer because it has required arguments that you must provide on construction?

I get the same error if I initialise it like this though… do you know how I can avoid initialising it here?

struct Cascai_Network : torch::nn::Module 
{
    Cascai_Network()
    {
        optimiser = torch::optim::Adam(this->parameters(), torch::optim::AdamOptions(2e-4).beta1(0.5));
    }

    Cascai_Network(int n_features, int n_outputs);

    torch::Tensor forward(torch::Tensor x);

    //

    torch::nn::Linear fc1{nullptr};
    torch::nn::Linear fc2{nullptr};
    torch::nn::Linear fc3{nullptr};

    torch::optim::Adam optimiser;
};

I think you want something like:

Cascai_Network() : optimiser(this->parameters(), torch::optim::AdamOptions(2e-4).beta1(0.5)) {}

But you might hit the same issue for your linear layers, which you should initialise the same way so they get the right feature sizes.

That did work, thank you - I wonder why the Adam constructor is private within module constructors.

Thanks again for your time.

It is private because the optimizer is only valid when created with arguments; it cannot be constructed by the user without them.
