Intermediary results are values from the forward pass that are needed to compute the backward pass.
For example, say your forward pass looks like this and you want gradients for the weights:
middle_result = first_part_of_net(inp)  # intermediate activation
out = middle_result * weights           # d(out)/d(weights) = middle_result
When computing the gradients, you need the value of middle_result, so it has to be stored during the forward pass. This is what I call intermediary results.
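Here is a minimal runnable sketch of the same idea (inp, weights, and first_part_of_net are just placeholder names, not part of any library): since d(out)/d(weights) = middle_result, the gradient autograd returns for weights is exactly the value that had to be saved.

import torch

inp = torch.randn(4)
weights = torch.randn(4, requires_grad=True)

def first_part_of_net(x):  # stand-in for the first half of the model
    return x * 2.0

middle_result = first_part_of_net(inp)
out = (middle_result * weights).sum()
out.backward()

# The gradient wrt weights is exactly the stored middle_result.
print(torch.allclose(weights.grad, middle_result))  # True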
These intermediary results are created whenever you perform operations that require some of the forward tensors to compute their backward pass.
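One way to see this explicitly is with a custom autograd.Function: the forward pass saves the tensors its backward pass will need via ctx.save_for_backward, which is conceptually what the built-in ops do as well. The sketch below is only for illustration:

import torch

class MulWeights(torch.autograd.Function):
    # Hand-written multiply, just to show where the saving happens.
    @staticmethod
    def forward(ctx, middle_result, weights):
        # These are the intermediary results: the backward below needs
        # them, so they are stored by the autograd engine.
        ctx.save_for_backward(middle_result, weights)
        return middle_result * weights

    @staticmethod
    def backward(ctx, grad_out):
        middle_result, weights = ctx.saved_tensors
        return grad_out * weights, grad_out * middle_result

Calling MulWeights.apply(middle_result, weights) then behaves like the plain multiplication above.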
To reduce memory usage, these are deleted during the backward pass as soon as they are not needed anymore. Of course, if you use this Tensor somewhere else in your code, you will still have access to it, but it won’t be stored by the autograd engine anymore.
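You can observe this freeing behaviour directly: once backward() has run without retain_graph=True, the saved tensors are released, and a second backward over the same graph raises an error. A small sketch of that behaviour:

import torch

weights = torch.randn(3, requires_grad=True)
out = (torch.randn(3) * weights).sum()

out.backward(retain_graph=True)  # keep the saved tensors alive
out.backward()                   # works, then frees the buffers

try:
    out.backward()               # saved tensors were already freed
except RuntimeError as e:
    print("second backward failed:", e)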