I am trying to extend the `mse_loss` of my network's predictions with a gradient penalty, i.e. an additional `mse_loss` term on the gradient of the prediction with respect to the input points.

```cpp
// Initialize training data
trainingPoints = torch::zeros({static_cast<long>(trainingLabels.size()), 3},
                              torch::requires_grad());
trainingValues = torch::zeros({static_cast<long>(trainingLabels.size()), 1});
// Data initialization ...
torch::Tensor trainingPrediction = torch::zeros_like(
    trainingValues,
    torch::requires_grad()
);
torch::Tensor trainingMse = torch::zeros_like(trainingValues);
// Solver setup ...
for (; epoch <= maxIterations; ++epoch)
{
    // Training
    optimizer.zero_grad();
    trainingPrediction = nn->forward(trainingPoints);
    auto predGrad = torch::autograd::grad(
        {trainingPrediction},
        {trainingPoints},
        {torch::ones_like(trainingValues)},
        /*retain_graph=*/true
    );
    trainingMse = mse_loss(trainingPrediction, trainingValues);
    // TODO: replace the current mse_loss with something like this,
    // but adapt the weights:
    /*trainingMse = 0.5 * mse_loss(trainingPrediction, trainingValues)
        + 0.5 * mse_loss(at::norm(predGrad[0]), torch::ones_like(trainingValues));*/
    trainingMse.backward();
    optimizer.step();
}
```

This fails in `trainingMse.backward()` with the following error:

```
terminate called after throwing an instance of 'std::logic_error'
  what():  leaf variable has been moved into the graph interior
```

I tried stepping through with `gdb` after setting a breakpoint there, but that turned out to be a dead end.

Since `trainingMse` is a leaf variable, does `trainingMse.backward()` count as an in-place modification of `trainingMse`, as described in this post?

How can I solve this?