I find the docs and the forum topics around autograd.grad very unclear, especially concerning the vector-Jacobian product.
Here is what I struggle with:
I need to compute the vector-Jacobian product v^\top J for one layer of a neural network.
Say f(x, w) is a parametric function from R^(2xn) x R^2 to R^n (n being the batch size, R^2 the parameter space, or hypothesis space as it's sometimes called among computer scientists), and v is in R^2. I want to compute v^\top J_f(x, w), where x is in R^(2xn) and w is in R^2.
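To make the shapes concrete, here is a tiny self-contained sketch of my setup (the names n, x, w, f are mine, with n = 5 instead of a real batch size), confirming that the Jacobian of f with respect to w has shape (n, 2):

```python
import torch

# Minimal sketch of the setup above (names are illustrative, not my real code):
# f maps a batch x in R^(n x 2) and a weight w in R^2 to n per-sample outputs.
n = 5
x = torch.randn(n, 2)
w = torch.randn(2)

def f(w):
    return torch.sigmoid(x @ w)  # shape (n,)

J = torch.autograd.functional.jacobian(f, w)
print(J.shape)  # torch.Size([5, 2]) -- the Jacobian w.r.t. w is (n, 2)
```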
From what I understood of the docs, it should suffice to call autograd.grad(f(x, w), w, grad_outputs=v).
But I get an error:
Mismatch in shape: grad_output has a shape of torch.Size([1, 2]) and output has a shape of torch.Size([n, 1]).
-> How can the output shape be relevant here? I want to multiply the Jacobian by v, and the shape of the Jacobian is (n, 2).
-> In addition, the docs should specify clearly which operation is applied to v when it is passed as grad_outputs=v to autograd.grad.
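For reference, here is my current best guess at what grad_outputs does, checked on a toy example of my own (not taken from the docs): for y = f(x), autograd.grad(y, x, grad_outputs=v) appears to return v^\top J_f(x), and v apparently has to match the shape of y, not the shape of the result:

```python
import torch

# My current understanding (toy example, not from the docs): for y = f(x),
# autograd.grad(y, x, grad_outputs=v) returns v^T J_f(x), and v must have
# the same shape as y, the output.
x = torch.randn(3, requires_grad=True)
y = x ** 2                       # elementwise, so J = diag(2 x)
v = torch.ones(3)                # shaped like y, NOT like the result v^T J
(g,) = torch.autograd.grad(y, x, grad_outputs=v)
print(torch.allclose(g, 2 * x))  # True: v^T diag(2 x) = 2 x
```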
Here is the code to reproduce the error:
```python
import torch
import torch.nn as nn

class example_net(nn.Module):
    def __init__(self):
        super(example_net, self).__init__()
        self.linear1 = nn.Linear(2, 1, bias=False)

    def forward(self, x):
        x = self.linear1(x)
        x = torch.sigmoid(x)
        return x.flatten()

criterion = nn.BCELoss(reduction='none')
exnet = example_net()
x = torch.randn(100, 2)
w = exnet.linear1.weight
y = torch.rand(100)  # BCELoss targets must lie in [0, 1]
u = torch.randn(2)
losses = criterion(exnet(x), y)
torch.autograd.grad(losses, w, grad_outputs=u)
```
RuntimeError: Mismatch in shape: grad_output has a shape of torch.Size([2]) and output has a shape of torch.Size([100]).
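For what it's worth, if I instead pass a grad_outputs vector shaped like the output (i.e. in R^n rather than R^2), the same call goes through. This is a stripped-down version of the code above, without the loss, so it seems grad_outputs has to match the output's shape, not the shape of the product I'm after:

```python
import torch
import torch.nn as nn

# Same setup as above, minus the loss: grad_outputs shaped like the output works.
exnet = nn.Linear(2, 1, bias=False)
x = torch.randn(100, 2)
out = torch.sigmoid(exnet(x)).flatten()  # shape (100,), one value per sample
v = torch.randn(100)                     # matches the output shape, not R^2
(g,) = torch.autograd.grad(out, exnet.weight, grad_outputs=v)
print(g.shape)  # torch.Size([1, 2]) -- shaped like the weight, i.e. like v^T J
```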
[EDIT] Corrected the output dimension in the definitions.