Need for a crystal-clear explanation of autograd.grad


I find the docs and the forum topics around autograd.grad very unclear, especially concerning the Jacobian-vector product.

Here is what I struggle with:

I need to compute the Jacobian-vector product vJ for one layer of neural network.

Say f(x, w) is a parametric function from R^(nx2) x R^2 to R^n (n being the batch size, R^2 the parameter space, or hypothesis space as it's sometimes called among computer scientists), and v is in R^2. I want to compute v^T J_f(x, w), where x is in R^(nx2) and w is in R^2.

From what I understood from the docs, it suffices to call autograd.grad(f(x, w), w, grad_outputs=v).

But I get an error:
Mismatch in shape: grad_output[0] has a shape of torch.Size([1, 2]) and output[0] has a shape of torch.Size([n, 1]).

-> How can the output shape be relevant here? I want to multiply the Jacobian by v, and the shape of the Jacobian is (n, 2).

-> In addition, the docs should state clearly what operation is applied to v when the argument grad_outputs=v is given to autograd.grad.

Here is the code to reproduce the error:

import torch
import torch.nn as nn

class example_net(nn.Module):
    def __init__(self):
        super(example_net, self).__init__()
        self.linear1 = nn.Linear(2, 1, bias=False)

    def forward(self, x):
        x = self.linear1(x)
        x = torch.sigmoid(x)
        return x.squeeze(-1)  # shape (batch,)

criterion = nn.BCELoss(reduction = 'none')

exnet = example_net()

x = torch.randn(100, 2)
w = exnet.linear1.weight
y = torch.randn(100)
u = torch.randn(2)
losses = criterion(exnet(x), y)

torch.autograd.grad(losses, w, grad_outputs=u)

RuntimeError: Mismatch in shape: grad_output[0] has a shape of torch.Size([2]) and output[0] has a shape of torch.Size([100]).

[EDIT] Corrected the output dimension in the definitions.


To clear up the notation when working with Jacobians, it is simpler to see your function as taking a single 1D input and returning a single 1D output. This way the Jacobian will be a 2D matrix of shape [nb_out, nb_in].
In your case, you can consider a function g that takes an input of size 2*n + 2 and returns an output of size 2.
The Jacobian for that function will have size [2, 2*n + 2].
So to be able to do a vector jacobian product, you need to provide a vector v of size 2.

Note that in pytorch, we relax this constraint by allowing multiple inputs/outputs and allowing them to be of higher dimension.
But the root idea remains: v should be the same size as the output.
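A minimal sketch of that constraint (the layer below is a stand-in I wrote for the one in your code, not your exact network):

```python
import torch

n = 100
x = torch.randn(n, 2)
w = torch.randn(1, 2, requires_grad=True)   # stand-in for the layer's weights
out = torch.sigmoid(x @ w.t()).squeeze(-1)  # output has size n = 100

v = torch.randn(n)                          # v must match the OUTPUT size
(vjp,) = torch.autograd.grad(out, w, grad_outputs=v)
print(vjp.shape)                            # the result matches w: torch.Size([1, 2])
```

Passing a v of size 2 (the parameter size) instead raises exactly the shape-mismatch error from the original post.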

In your code sample, because you set reduction = 'none' for the criterion, no reduction happens, so your loss (as mentioned in the error message) is of size 100 (the batch size). I think you should double-check the definition of BCELoss: the reduction happens over the batch dimension, and you never get a score per class, so you cannot get an output of size 2 from it.
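For instance, a quick illustration of the two reduction modes and the shapes they produce:

```python
import torch
import torch.nn as nn

pred = torch.sigmoid(torch.randn(100))        # probabilities in (0, 1)
target = torch.randint(0, 2, (100,)).float()  # binary targets

loss_none = nn.BCELoss(reduction='none')(pred, target)
loss_mean = nn.BCELoss(reduction='mean')(pred, target)
print(loss_none.shape)  # torch.Size([100]) -- one loss per sample
print(loss_mean.shape)  # torch.Size([])    -- reduced over the batch
```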

Hi Alban,

Thank you for the fast response.

I think that you didn’t answer the question. And I see now that I’ve made a mistake when asking. I’ll edit that. The correct definition is the following:
The function f maps R^(nx2)xR^2 into R^n. And we differentiate only w.r.t the parameters (which are in R^2).

First, the Jacobian matrix here is in fact of size [nb_out, nb_parameters], as I am differentiating w.r.t. the weights of the only layer. So it should be of size [100, 2]. I am sure of that, as I am able to derive it on paper as well as compute it without autograd. Since v is in R^2, there is, in theory, no problem doing the operation Jv.

So what I want is to retrieve the Jacobian-vector product Jv. But there is a check that the shapes of v and y (the output) match, and that makes no sense here.
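For reference, the shape claim can be checked numerically by building the Jacobian row by row, one backward pass per output element (the snippet below uses a stand-in for my layer; names are illustrative):

```python
import torch

n = 100
x = torch.randn(n, 2)
w = torch.randn(1, 2, requires_grad=True)   # the layer's weights
out = torch.sigmoid(x @ w.t()).squeeze(-1)  # output in R^100

# Build the full Jacobian d(out)/d(w), one row per output element.
rows = [torch.autograd.grad(out[i], w, retain_graph=True)[0].reshape(-1)
        for i in range(n)]
J = torch.stack(rows)                       # shape [100, 2]

v = torch.randn(2)
Jv = J @ v                                  # Jv is well-defined, shape [100]
print(J.shape, Jv.shape)
```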

Concerning BCELoss, I've read the definition, and reduction = 'none' is what I need, since I need to play with reweighting the expected value of the likelihood (binary cross-entropy if you prefer), but those details are irrelevant here.

Thank you in advance; being able to do that Jacobian-vector product would make my research much easier.


I think the misconception here is between vector-Jacobian product vs Jacobian-vector product.
Reverse-mode AD, which all DL frameworks use, computes vector-Jacobian products, where v is the same size as the output.

If you want to do a Jacobian-vector product (and so v is the same size as the input), you need a different trick. If you use pytorch nightly, we added functions to do this here.
Otherwise, you can use something based on this old gist:
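The gist itself isn't reproduced in this thread, but here is a sketch of the usual "double backward" trick it relies on (my own reconstruction, not the gist's exact code; recent PyTorch also ships torch.autograd.functional.jvp, which I believe is based on the same idea):

```python
import torch

def jvp(y, x, v):
    # Jacobian-vector product J @ v computed from two vector-Jacobian
    # products (the "double backward" trick).
    u = torch.ones_like(y, requires_grad=True)  # dummy vector, same size as y
    (g,) = torch.autograd.grad(y, x, grad_outputs=u, create_graph=True)
    # g(u) = u^T J is linear in u; differentiating g w.r.t. u with
    # grad_outputs=v therefore yields J @ v.
    (Jv,) = torch.autograd.grad(g, u, grad_outputs=v)
    return Jv

# Setup mirroring the thread: output in R^100, parameters in R^2.
x = torch.randn(100, 2)
w = torch.randn(1, 2, requires_grad=True)
out = torch.sigmoid(x @ w.t()).squeeze(-1)

v = torch.randn(1, 2)        # same shape as the input w
Jv = jvp(out, w, v)
print(Jv.shape)              # same shape as the output: torch.Size([100])
```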

This is perfect thank you !