Leaf variable has been moved into the graph interior: extending mse_loss with a gradient penalty

I am trying to extend mse_loss with an additional MSE term on the gradient of the prediction with respect to the inputs (a gradient penalty).

    // - Initialize training data
    // (inputs need requires_grad so the gradient penalty can be computed)
    trainingPoints = torch::zeros({static_cast<long>(trainingLabels.size()),3},
                                  torch::requires_grad());
    trainingValues = torch::zeros({static_cast<long>(trainingLabels.size()),1});

    // Data initialization ...

    torch::Tensor trainingPrediction = torch::zeros_like(trainingValues);
    torch::Tensor trainingMse = torch::zeros_like(trainingValues);

    // Solver setup ...

    for (; epoch <= maxIterations; ++epoch)
    {
        // Training
        trainingPrediction = nn->forward(trainingPoints);
        auto predGrad = torch::autograd::grad(
            {trainingPrediction}, {trainingPoints},
            {torch::ones_like(trainingPrediction)},
            /*retain_graph=*/true, /*create_graph=*/true);
        trainingMse = mse_loss(trainingPrediction, trainingValues);
        // TODO: replace the current mse_loss with something like this, but adapt the weights
        /*trainingMse = 0.5*mse_loss(trainingPrediction, trainingValues)
                      + 0.5*mse_loss(at::norm(predGrad[0]), torch::ones_like(trainingValues));*/
        // ... backward(), optimizer step ...
    }


This gives me an error:

    terminate called after throwing an instance of 'std::logic_error'
      what(): leaf variable has been moved into the graph interior

in trainingMse.backward(). I tried stepping through in gdb after setting a breakpoint there, but that was a dead end.

trainingMse starts out as a leaf variable; does trainingMse.backward() amount to an in-place change of trainingMse, as described in this post?

How can I solve this?