Hi @jcallahan4,

Good to see your error is already solved.

Since you wanted to understand what autograd is doing, and how to get the network to use the input tensors in the computation graph, I’m adding some details in that regard:

`torch.autograd` is PyTorch's automatic differentiation engine that, as the name suggests, automatically calculates gradients for any "computational graph".

Computational graphs are what get built by autograd as and when tensors are subjected to mathematical operations. While building these graphs, `autograd` also saves the tensors that'll be required to calculate the gradients wrt tensors having their `requires_grad` attribute set to `True`.

(So, when you use `torch.autograd.grad` or the `.backward` call, these saved tensors are used.)

Now, `torch.no_grad()` basically tells autograd to look away. It can be used as a context manager so that *for any piece of code occurring within this context*, autograd will build no graph (and will not further populate any graph that's already there), i.e. it'll not track any operations.
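For example (a tiny illustration, not your code): any tensor produced inside the context comes out with no history attached:

```python
import torch

x = torch.tensor(2.0, requires_grad=True)

with torch.no_grad():
    z = x * 3                     # not tracked: no graph node is created for this op

print(z.requires_grad)            # False
print(z.grad_fn)                  # None: z has no recorded history
```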

Now, for your code: you are differentiating y (the output) wrt x, where `y = net(x)`, which essentially means `y = net.forward(x)`.

Inside `forward`, `output_layer(z)`, which is returned (and hence is essentially what gets stored in y by `y = net(x)`), is a result of operations on `normalized_x`. But `normalized_x` is getting created as a result of operations on x under `torch.no_grad()`.

This means that even if `y = self.input_act(self.input_layer(normalized_x))` and `z = self.hidden_layers(y)` are a part of the computation graph, `normalized_x` isn't really.

And so, when you tried to differentiate y (which is returned by `forward`) wrt x, it produced the error:

`One of the differentiated Tensors appears to not have been used in the graph. Set allow_unused=True if this is the desired behavior.`

Here, the differentiated tensor that "appears to not have been used in the graph" is x itself: because `normalized_x` was created under `torch.no_grad()`, the graph that produces y starts at `normalized_x` and never references x.
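A stripped-down reproduction of that situation, with a single hypothetical weight `w` standing in for your network's parameters: `y` still requires grad (through `w`), but `x` never enters the graph, so differentiating wrt `x` fails with exactly that error:

```python
import torch

w = torch.tensor([2.0], requires_grad=True)       # stands in for the network's weights
x = torch.tensor([1.0, 2.0], requires_grad=True)

with torch.no_grad():
    normalized_x = x / x.max()                    # no graph is built for this op

y = (w * normalized_x).sum()                      # y requires grad via w, but not via x

try:
    torch.autograd.grad(y, x)
except RuntimeError as e:
    print(e)  # One of the differentiated Tensors appears to not have been used in the graph...

# With allow_unused=True, the gradient for x is simply None instead of an error:
print(torch.autograd.grad(y, x, allow_unused=True))  # (None,)
```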

Note: even though `normalized_x` is getting created as a result of operations on `x`, *whose* `requires_grad` is set to True, it doesn't matter. Under `torch.no_grad()`, nothing is tracked by autograd, and so all resulting tensors have their `requires_grad=False`.
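If you do want gradients to flow back to `x`, one common pattern (an assumption about your intent, since I haven't seen the rest of your code) is to compute the normalization statistics under `no_grad()` but apply them *outside* it, so the statistics act as constants while `x` itself stays in the graph:

```python
import torch

x = torch.tensor([1.0, 2.0, 3.0], requires_grad=True)

with torch.no_grad():
    mean, std = x.mean(), x.std()     # statistics treated as constants (no graph)

normalized_x = (x - mean) / std       # tracked: x is now part of the graph
print(normalized_x.requires_grad)     # True

y = normalized_x.sum()
g, = torch.autograd.grad(y, x)        # works: each element's gradient is 1/std
print(g)
```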

Hope this helps,

Srishti