I want to do the following:

Say I have a neural network NN and inputs x.

- To train the network, I will backpropagate gradients from a loss term:

```
y = NN(x)
loss = ((y - y_actual)**2).mean()  # reduce to a scalar so loss.backward() works
```
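
For concreteness, these are the placeholder definitions I'm assuming throughout; the `Linear` layer and the shapes are arbitrary stand-ins for the real setup:

```
import torch

NN = torch.nn.Linear(3, 1)                 # stand-in for the real network
x = torch.randn(8, 3, requires_grad=True)  # requires_grad so dy/dx can be tracked
y_actual = torch.randn(8, 1)               # stand-in targets
```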

- I also want to evaluate dy/dx on each pass, along with the usual d(loss)/d(weights of NN):

```
y.backward(torch.ones_like(y), retain_graph=True)  # stores dy/dx in x.grad; keep the graph for the next backward
loss.backward()  # d(loss)/d(weights) for gradient descent
```
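
One wrinkle with the two-backward version above (a sketch, assuming the setup from the first snippet): the second backward also accumulates d(loss)/dx into x.grad, so dy/dx has to be copied out in between:

```
y.backward(torch.ones_like(y), retain_graph=True)
dy_dx = x.grad.clone()  # snapshot dy/dx before d(loss)/dx is added on top of it
x.grad = None           # reset so gradients don't accumulate across the two calls
loss.backward()         # fills d(loss)/d(weights); also writes d(loss)/dx into x.grad
```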

Is this possible in a single run? And if not, what is the best way to evaluate each set of gradients efficiently, without needlessly computing dy/d(weights) or d(loss)/dx?
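
For reference, this is the kind of single-pass combination I have in mind, a sketch built on torch.autograd.grad (the inputs= argument to backward() restricts which .grad fields get filled, if I understand it correctly):

```
import torch

NN = torch.nn.Linear(3, 1)                 # stand-in for the real network
x = torch.randn(8, 3, requires_grad=True)  # inputs must require grad for dy/dx
y_actual = torch.randn(8, 1)               # stand-in targets

y = NN(x)
loss = ((y - y_actual)**2).mean()

# dy/dx, returned directly instead of being accumulated into x.grad;
# retain_graph=True keeps the graph alive for the loss backward below
(dy_dx,) = torch.autograd.grad(y, x, grad_outputs=torch.ones_like(y),
                               retain_graph=True)

# d(loss)/d(weights); restricting inputs= avoids also filling x.grad
loss.backward(inputs=list(NN.parameters()))
```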