This is a paraphrased version of a question from a previous post.
Assume we define a simple network consisting of one linear layer, and we have an update (a perturbation) to the network weights. Is it possible to compute the gradient of the loss with respect to that update?
class SimpleNet(torch.nn.Module):
    """A minimal network: a single bias-free linear layer mapping D_in -> D_out."""

    def __init__(self, D_in: int, D_out: int) -> None:
        # Zero-arg super() is the modern Python 3 idiom.
        super().__init__()
        # bias=False so the only parameter is the weight matrix (D_out, D_in).
        self.linear = torch.nn.Linear(D_in, D_out, bias=False)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        """Apply the linear layer: returns x @ weight.T with shape (..., D_out)."""
        return self.linear(x)
# Create a network
num_sample, dim_in, dim_out = 2, 5, 1
net = SimpleNet(dim_in, dim_out)

# The perturbation we want to differentiate with respect to.
# Variable is deprecated since PyTorch 0.4: a plain tensor with
# requires_grad=True is a graph leaf.
eps = torch.randn(dim_out, dim_in, requires_grad=True)

# Grab the single weight parameter of the linear layer.
name, param = list(net.named_parameters())[0]

x = torch.randn(num_sample, dim_in)

# BUG in the original: `param.data += eps` mutates the raw storage outside
# the autograd graph, so eps never appears in the forward computation and
# torch.autograd.grad(some_loss, eps) fails ("not part of the graph").
# Fix: apply the perturbed weight *functionally* so the addition of eps is
# recorded by autograd.
some_loss = torch.sum(torch.nn.functional.linear(x, param + eps))

# Now the gradient w.r.t. eps is well-defined; for sum(x @ (W+eps).T) it
# equals sum over samples of x (broadcast to eps's shape).
grad_eps, = torch.autograd.grad(some_loss, eps)