# Does autograd save copies of constant buffers?

If I have a PyTorch module `M` with a constant matrix buffer `T`, where `M.forward(x) = x @ T`, and I apply this module `M` at several points in a deep model, will autograd save one copy of the (constant) matrix `T` per invocation, or will it "realize" that the matrix `T` is constant and keep just one copy?
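For concreteness, a minimal sketch of the module I have in mind (the class and argument names are just for illustration):

```python
import torch
import torch.nn as nn

class M(nn.Module):
    def __init__(self, T):
        super().__init__()
        # T is a fixed (non-trainable) matrix, registered as a buffer
        self.register_buffer("T", T)

    def forward(self, x):
        return x @ self.T
```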

If `T` is registered as a buffer, it will be stored in memory once.
The output created by e.g. `x @ self.T` will of course be a new tensor.
Could you explain your question a bit more, as I think I might have misunderstood the problem?

So given a deep model `f = (f_3 . M . f_2 . M . f_1)`, when we evaluate `f(x)`, `M` is applied twice: once to `f_1(x)` and once to `f_2(M(f_1(x)))`. The output at each point will of course be a new tensor.
If `T` were not constant, and we needed to compute the gradient w.r.t. both `T` and the input to `M`, we would need to save both `T` and the inputs to `M` in the forward pass to be able to compute the backward pass. But since `T` is constant, and we only need to compute the gradient w.r.t. `x`, the backward pass of `M` is itself constant (it only depends on `T`, not on the inputs).
The question is whether I can safely assume that autograd's computation graph is constructed in such a way that it does not store unnecessary copies of `T`. (This might very well be a naive question with an obvious answer, but I'm not super knowledgeable about the inner workings of autograd.)
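In custom-`Function` terms, what I have in mind is something like this hypothetical sketch, where only `T` needs to be saved for backward, because `T` itself never receives a gradient:

```python
import torch

class MatMulByConstant(torch.autograd.Function):
    @staticmethod
    def forward(ctx, x, T):
        # Only T is saved; x is not needed, since we never
        # compute a gradient w.r.t. the constant T.
        ctx.save_for_backward(T)
        return x @ T

    @staticmethod
    def backward(ctx, grad_out):
        T, = ctx.saved_tensors
        # Gradient w.r.t. x; None for the constant T.
        return grad_out @ T.t(), None
```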

Yes, autograd should figure this out.
Here is a small example, which shows that leaf variables are treated in a special way, i.e. inplace modifications are not allowed if the value of the variable is needed to calculate the gradient:

```python
import torch

x = torch.randn(1, requires_grad=True)
y = torch.randn(1, requires_grad=True)

# Both require gradients
out = x * y
out.backward()

# Modify inplace
y[0] = 2.  # error
out = x * y
out.backward()
# > RuntimeError: leaf variable has been moved into the graph interior

# Modify inplace of constant
x = torch.randn(1, requires_grad=True)
y = torch.randn(1)  # "constant": does not require gradients

y[0] = 2.  # works
out = x * y
out.backward()
```

As you can see, the inplace modification works for "constant" tensors, i.e. tensors that do not require gradients.
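If you want to check directly that no copy of `T` is stored per invocation, recent PyTorch versions expose the tensors saved for backward as `_saved_*` attributes on the `grad_fn` (these are private and may change between versions). A sketch, assuming such a version:

```python
import torch

x1 = torch.randn(2, 3, requires_grad=True)
x2 = torch.randn(2, 3, requires_grad=True)
T = torch.randn(3, 4)  # constant, no grad required

out1 = x1 @ T
out2 = x2 @ T

# Both backward nodes reference the very same storage as T,
# i.e. no per-invocation copy of the constant matrix is stored.
print(out1.grad_fn._saved_mat2.data_ptr() == T.data_ptr())  # True
print(out2.grad_fn._saved_mat2.data_ptr() == T.data_ptr())  # True
```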


I see. And the following also fails:

```python
x = torch.randn(1, requires_grad=True)