How does autograd(f,model.parameters(),grad_outputs) work?

Hi, I am new to autograd() as I have mostly relied on loss.backward() so far.

x = torch.tensor([1.0,2.0],requires_grad = True)
W = torch.tensor([[1.0,1.0],[1.5,1.2]],requires_grad = True)
f1 = W@x
print(torch.autograd.grad(f,W,grad_outputs = torch.tensor([1.,1.])))
This throws the error “One of the differentiated Tensors appears to not have been used in the graph. Set allow_unused=True if this is the desired behavior.” Setting allow_unused=True just makes it return None.
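
For reference, a minimal sketch of the same behaviour with a tensor that is not used to build the output (the variable names are mine):

import torch

x = torch.tensor([1.0, 2.0], requires_grad=True)
unused = torch.tensor([3.0, 4.0], requires_grad=True)  # never used below
y = (x * 2).sum()

# `unused` is not part of y's graph, so its gradient comes back as None
print(torch.autograd.grad(y, unused, allow_unused=True))
# (None,)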

Now, if I do this -
lin = nn.Linear(2,2,bias = False)
lin.weight = nn.Parameter(W)
f = lin(x)
print(torch.autograd.grad(f,lin.parameters(),grad_outputs = torch.tensor([1.,1.])))
I get the correct output:
(tensor([[1., 2.],
        [1., 2.]]),)

So, my first question is: what changes between the two cases?
I initially thought that maybe, in the second case, the calculation is broken down into something like this:

x = torch.tensor([1.0,2.0],requires_grad = True)
W = torch.tensor([[1.0,1.0],[1.5,1.2]])
for i in range(W.shape[0]):
    temp_W = W[i].clone()  # row i of W
    temp_W.requires_grad = True
    f_i = temp_W @ x  # dot product with x, a scalar
    torch.autograd.grad(f_i, temp_W)
but here we would not even need grad_outputs, since each per-row output is a scalar. However, if I remove grad_outputs when calling autograd.grad with lin.parameters(), it throws the error “grad can be implicitly created only for scalar outputs”.
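
For example, a minimal sketch of what I mean (the .sum() reduction and the names f_scalar / f_vec are just my own way of illustrating it):

import torch
import torch.nn as nn

x = torch.tensor([1.0, 2.0], requires_grad=True)
lin = nn.Linear(2, 2, bias=False)

# Scalar output: grad_outputs can be omitted
f_scalar = lin(x).sum()
print(torch.autograd.grad(f_scalar, lin.parameters()))

# Vector output: grad_outputs is required, otherwise
# "grad can be implicitly created only for scalar outputs"
f_vec = lin(x)
print(torch.autograd.grad(f_vec, lin.parameters(), grad_outputs=torch.tensor([1., 1.])))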

Can someone please explain what I am missing?

You are using print(torch.autograd.grad(f,W,grad_outputs = torch.tensor([1.,1.]))) where f is undefined:

x = torch.tensor([1.0,2.0],requires_grad = True)
W = torch.tensor([[1.0,1.0],[1.5,1.2]],requires_grad = True)
f1 = W@x
print(torch.autograd.grad(f,W,grad_outputs = torch.tensor([1.,1.])))
# NameError: name 'f' is not defined

Fixing it shows a valid gradient:

x = torch.tensor([1.0,2.0],requires_grad = True)
W = torch.tensor([[1.0,1.0],[1.5,1.2]],requires_grad = True)
f1 = W@x
print(torch.autograd.grad(f1, W,grad_outputs = torch.tensor([1.,1.])))
# (tensor([[1., 2.],
#         [1., 2.]]),)
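
As far as I understand, grad_outputs is the vector v in the vector-Jacobian product that autograd computes for a non-scalar output, so passing grad_outputs=v is the same as differentiating the scalar (f1 * v).sum(). A minimal sketch (the variable names are mine):

import torch

x = torch.tensor([1.0, 2.0], requires_grad=True)
W = torch.tensor([[1.0, 1.0], [1.5, 1.2]], requires_grad=True)
v = torch.tensor([1., 1.])

# Vector output with an explicit grad_outputs
f1 = W @ x
g1 = torch.autograd.grad(f1, W, grad_outputs=v)[0]

# Same thing via a scalar: reduce the output with v first, then no grad_outputs is needed
f2 = W @ x
g2 = torch.autograd.grad((f2 * v).sum(), W)[0]

print(torch.allclose(g1, g2))  # True

That would also explain why the nn.Linear version behaved the same: for a 1-D input, lin(x) computes x @ W.T, which equals W @ x, so once f is actually built from W the two cases are identical.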