Jacobian of RNN state w.r.t. the previous state

Hi! I have RNN outputs h of shape [T, N], and I’m trying to compute the Jacobian of h(t+1) with respect to the previous state h(t) (the eigenvalues of such Jacobians are related to chaoticity of the dynamics).

Here’s the code:

import torch
from torch.autograd import grad


def jacobian(h):
    """Compute the Jacobian of an RNN state vector h(t+1) with respect to h(t) for each t.

    :param h: shape [T, N]; output from an RNN
    :return: tensor of Jacobian matrices of shape [T-1, N, N]
    """
    T, N = h.shape
    J = []
    for t in range(T-1):
        J_t = []
        for n in range(N):
            J_t.append(grad(h[t+1, n], h[t], retain_graph=True)[0])  # grad returns a tuple; retain the graph for the next call
        J_t = torch.stack(J_t)  # [N, N]
        J.append(J_t)

    J = torch.stack(J)  # [T-1, N, N]

    return J

However, running this I get "RuntimeError: One of the differentiated Tensors appears to not have been used in the graph. Set allow_unused=True if this is the desired behavior.", and trying with allow_unused=True just gives None for every gradient…

I guess I don’t really understand autograd that well… is something like this even possible?

EDIT: FYI I don’t need this to be differentiable… just want to look at the eigenvalues “post mortem”.
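For context, all I plan to do with the result is something along these lines (assuming J is the [T-1, N, N] output of the function above):

import torch

# J: [T-1, N, N] stack of Jacobians; eigvals is complex-valued, shape [T-1, N]
eigvals = torch.linalg.eigvals(J)
spectral_radius = eigvals.abs().max(dim=-1).values  # [T-1], largest |eigenvalue| per step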

OK, I’m thinking that instead of reusing the already computed h’s, I probably need to treat h(t) (and x(t)) as gradient-requiring inputs, re-run one step of the cell to get h(t+1), and take the gradient w.r.t. h(t) separately for each time step… something like the sketch below. Can anyone confirm and/or help out a bit?
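Here is a rough sketch of what I mean, with an nn.RNNCell called cell and an input sequence x of shape [T, input_size] standing in for whatever recurrence actually produced h (placeholder names, and your cell may need a batch dimension added):

import torch


def jacobian_one_step(cell, x, h):
    """Recompute each transition h(t) -> h(t+1) with h(t) as a fresh leaf tensor,
    then differentiate that single step.

    :param cell: one-step recurrence, e.g. an nn.RNNCell (placeholder)
    :param x: input sequence, shape [T, input_size]
    :param h: recorded states, shape [T, N]
    :return: tensor of Jacobian matrices of shape [T-1, N, N]
    """
    T, N = h.shape
    J = []
    for t in range(T - 1):
        h_t = h[t].detach().requires_grad_(True)   # leaf tensor, used only for this step
        f = lambda h_in: cell(x[t], h_in)          # single transition h(t) -> h(t+1)
        J.append(torch.autograd.functional.jacobian(f, h_t))  # [N, N]
    return torch.stack(J)  # [T-1, N, N]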