Leaf variable has been moved into the graph interior: extending mse_loss with a gradient penalty

I am trying to extend mse_loss with a gradient penalty: an additional mse_loss term on the norm of the prediction's gradient with respect to the input points.

    // Initialize training data
    trainingPoints = torch::zeros({static_cast<long>(trainingLabels.size()),3},
                                   torch::requires_grad());
    trainingValues = torch::zeros({static_cast<long>(trainingLabels.size()),1});

    // Data initialization ...

    torch::Tensor trainingPrediction = torch::zeros_like(
        trainingValues, 
        torch::requires_grad()
    );
    torch::Tensor trainingMse = torch::zeros_like(trainingValues);

    // Solver setup ...

    for (; epoch <= maxIterations; ++epoch) 
    {
        // Training
        optimizer.zero_grad();

        trainingPrediction = nn->forward(trainingPoints);
        auto predGrad = torch::autograd::grad(
            {trainingPrediction},
            {trainingPoints},
            {torch::ones_like(trainingValues)},
            /*retain_graph=*/true
        );
        trainingMse = mse_loss(trainingPrediction, trainingValues);
        // TODO: replace the current mse_loss with something like this,
        // but adapt the weights:
        // trainingMse = 0.5*mse_loss(trainingPrediction, trainingValues)
        //             + 0.5*mse_loss(at::norm(predGrad[0]), torch::ones_like(trainingValues));

        trainingMse.backward(); 
        optimizer.step();
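
For context, the penalty variant I am aiming for would look roughly like the sketch below, placed inside the training loop. This is only a sketch of my intent, not working code: passing create_graph = true (so the penalty term itself stays differentiable) and taking a per-sample norm over dim 1 are my assumptions, and the 0.5/0.5 weights still need adapting.

    // Recompute the input gradient with create_graph = true so that
    // the penalty term contributes to the weight gradients in backward().
    auto predGradPenalty = torch::autograd::grad(
        {trainingPrediction},
        {trainingPoints},
        {torch::ones_like(trainingPrediction)},
        /*retain_graph=*/true,
        /*create_graph=*/true
    );
    // Per-sample 2-norm over the input dimensions, kept as (N,1)
    // so the shape matches trainingValues in mse_loss.
    torch::Tensor gradNorm =
        predGradPenalty[0].norm(2, /*dim=*/{1}, /*keepdim=*/true);
    trainingMse = 0.5 * torch::mse_loss(trainingPrediction, trainingValues)
                + 0.5 * torch::mse_loss(gradNorm, torch::ones_like(trainingValues));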

Even with the penalty still commented out, the plain mse_loss version above gives me an error:

    terminate called after throwing an instance of 'std::logic_error'
      what():  leaf variable has been moved into the graph interior

The exception is thrown in trainingMse.backward(). I tried stepping through it in gdb after setting a breakpoint there, but that turned out to be a dead end.

trainingMse is a leaf variable, so does trainingMse.backward() amount to an in-place change of trainingMse, as described in this post?
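
For what it's worth, I could inspect the autograd state of the tensors involved just before the backward() call with something like this (a sketch using the Tensor::is_leaf() and Tensor::requires_grad() accessors; it assumes <iostream> is included):

    // Placed right before trainingMse.backward() in the training loop.
    std::cout << "trainingPoints:     is_leaf=" << trainingPoints.is_leaf()
              << ", requires_grad=" << trainingPoints.requires_grad() << "\n"
              << "trainingPrediction: is_leaf=" << trainingPrediction.is_leaf()
              << ", requires_grad=" << trainingPrediction.requires_grad() << "\n"
              << "trainingMse:        is_leaf=" << trainingMse.is_leaf()
              << ", requires_grad=" << trainingMse.requires_grad() << "\n";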

How can I solve this?