How to compute the finite difference Jacobian matrix

marcodena · February 23, 2021, 1:11pm

Dear community,
I need to compute the (differentiable) Jacobian matrix of y = f(z), where y is a batch of images of shape (B, 3, 128, 128) and z is a batch of vectors of shape (B, 64).

Computing the Jacobian matrix with PyTorch's functional Jacobian (torch.autograd.functional.jacobian, see the torch.autograd documentation) is too slow. Thus, I am exploring the finite difference method (see "Finite difference" on Wikipedia), which gives an approximation of the Jacobian. My implementation for B=1 is:

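import torch

def get_jacobian(net, z, x):
    # Step size: square root of a uniform random sample in [0, 1)
    eps = torch.rand((x.size(0), ), device=x.device)
    delta = torch.sqrt(eps)

    # Flatten the reference output x = net(z) to (B, m); this version assumes B = 1
    x = x.view(x.size(0), -1)
    m = x.size(1)
    n = z.size(1)

    J = torch.zeros((m, n), device=x.device)
    I = torch.eye(n, device=x.device)

    # Forward difference: perturb one coordinate of z at a time
    for j in range(n):
        J[:, j] = (net(z+delta*I[:, j]).view(x.size(0), -1) - x) / delta

    return J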

x_fake = mynetfunction(z)
J = get_jacobian(mynetfunction, z, x_fake)

However, it requires a lot of memory. Is there a way to make it better?

albanD · February 24, 2021, 4:41pm

Hi,

I am not sure you are generating your eps correctly: rand samples uniformly from [0, 1), not from a normal distribution.
Also, the value in the formula is for a variance close to 0.
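
For comparison, here is a small sketch of one common step-size rule of thumb for forward differences, which takes eps to be the machine epsilon of the floating-point type rather than a random sample. Whether this is the formula the question refers to is an assumption; the variable names are only illustrative.

```python
import math
import torch

# Machine epsilon: the gap between 1.0 and the next representable float.
eps64 = torch.finfo(torch.float64).eps
eps32 = torch.finfo(torch.float32).eps

# Rule-of-thumb forward-difference step for inputs of magnitude around 1.
delta64 = math.sqrt(eps64)
delta32 = math.sqrt(eps32)

print(f"float64: eps={eps64:.3e}, sqrt(eps)={delta64:.3e}")
print(f"float32: eps={eps32:.3e}, sqrt(eps)={delta32:.3e}")
```

Under this rule the step is a small fixed constant, and the single-precision step is several orders of magnitude coarser than the double-precision one.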

We actually use this to check our gradients in gradcheck.py (master branch of pytorch/pytorch on GitHub).
The main things are:

Always use double precision; otherwise the precision is just really bad.
Use delta=1e-6 for double precision. There is no need to do any sampling.
You can use a centered difference if you need more precision: (f(z + e_t * eps/2) - f(z - e_t * eps/2)) / eps. You can get even more precision with additional evaluation points. Check the Wikipedia page for the exact formula you need to use in that case.
Don’t create the full I matrix. Just use one vector and write a 1 at the proper index into it.
If you’re using an nn.Module, disable autograd with @torch.no_grad() to avoid extra allocations.
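
Putting these points together, a minimal sketch for B=1 could look like the following. The name fd_jacobian and the toy linear map used to check it are hypothetical, not part of the original code. It assumes the network itself runs in double precision (for an nn.Module, for example after calling .double() on it). Because autograd is disabled, the result is not differentiable.

```python
import torch

@torch.no_grad()  # no autograd graph is recorded, which saves memory
def fd_jacobian(f, z, delta=1e-6):
    """Central-difference Jacobian of f at z, for a batch of size 1.

    z has shape (1, n). f(z) may have any shape and is flattened to m values.
    Returns an (m, n) tensor in double precision.
    """
    z = z.detach().double()
    n = z.size(1)
    e = torch.zeros_like(z)  # one reusable perturbation vector, no identity matrix
    J = None
    for j in range(n):
        e[0, j] = delta / 2
        y_plus = f(z + e).reshape(-1)
        y_minus = f(z - e).reshape(-1)
        e[0, j] = 0.0  # reset before moving to the next coordinate
        if J is None:
            # Allocate once the output size m is known.
            J = torch.empty(y_plus.numel(), n, dtype=torch.float64, device=z.device)
        J[:, j] = (y_plus - y_minus) / delta
    return J

# Toy check (hypothetical stand-in for the network): the Jacobian of z -> z W^T is W.
torch.manual_seed(0)
W = torch.randn(5, 3, dtype=torch.float64)
f = lambda z: z @ W.T
z = torch.randn(1, 3)
J = fd_jacobian(f, z)
print(torch.allclose(J, W, atol=1e-6))
```

Only two forward outputs and the Jacobian itself are held at any time, so peak memory is dominated by the (m, n) result. The price is two forward passes per input coordinate instead of one.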