I have a neural network that takes an input x of shape [batch, timesteps, x_features] and an input p of shape [batch, p_features]. The output is of shape [batch, timesteps, out_features].
What I want to calculate is the Jacobian of the output with respect to p, so the Jacobian should be of shape [batch, timesteps, out_features, p_features].
Notice that the differentiation should happen for the out_features at every timestep. I could reshape the output to shape [batch, timesteps*out_features], but I want to avoid this because of the calculations that follow…
In practice I use the following values:
batch = 1
timesteps = 601
x_features = 20
p_features = 14
out_features = 15
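To make the shapes concrete, here is a minimal sketch with a toy stand-in for my model (the toy model and everything in it are placeholders, not my actual network):

```python
import torch

batch, timesteps, x_features, p_features, out_features = 1, 601, 20, 14, 15

x = torch.randn(batch, timesteps, x_features)
p = torch.randn(batch, p_features, requires_grad=True)

# Toy stand-in: any differentiable map (x, p) -> [batch, timesteps, out_features].
def model(x, p):
    return x[..., :out_features] * p.sum()

out = model(x, p)
print(out.shape)  # torch.Size([1, 601, 15])
# Desired: d out / d p with shape [batch, timesteps, out_features, p_features]
```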
I tried two things:
1st approach:
```python
from torch import autograd

def partial_forward(p):
    # x is captured from the enclosing scope; differentiate w.r.t. p only
    return model(x, p)

jacobian = autograd.functional.jacobian(partial_forward, p)
```
This gives me a Jacobian of shape [timesteps, out_features, batch, p_features], which does not seem correct, and the calculation also took 16.5 seconds. One forward pass usually takes 0.002 sec, and I noticed that 1 x 601 x 14 x 0.002 sec = 16.8 sec, which seems a bit strange to me, since I additionally have 15 out_features.
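For what it's worth, torch.autograd.functional.jacobian returns a tensor of shape out.shape + p.shape, so I would have expected [batch, timesteps, out_features, batch, p_features] here; assuming that, and with batch = 1, the second batch axis can simply be dropped to get the shape I want. The vectorize=True flag is a guess on my part at reducing the row-by-row cost:

```python
# Expected shape: out.shape + p.shape = [1, 601, 15, 1, 14]
jac = autograd.functional.jacobian(partial_forward, p, vectorize=True)

# With batch = 1 both batch axes index the same sample, so drop the second:
jac = jac[:, :, :, 0, :]  # -> [1, 601, 15, 14]
```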
Hence I tried a 2nd approach:
```python
from torch.func import functional_call, vmap, jacrev

params = dict(model.named_parameters())

def fmodel(params, x, p):
    return functional_call(model, params, (x, p))

result = vmap(jacrev(fmodel, argnums=2), in_dims=(None, None, 0))(params, x, p)
```
which I took from AlphaBetaGamma96's answer, but here I run into the error 'RuntimeError: jacrev: Expected all inputs to be real but received complex tensor at flattened input idx: 4'. This is because my neural network model uses the Fourier transform, which I rely on. But I also apply the inverse Fourier transform, so both the input and the output of the network are real; it is just that some of the weights are complex.
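One workaround I am considering (only a sketch; I have not verified that it plays well with the rest of torch.func) is to register the complex weights as real tensors with a trailing dimension of 2 and rebuild the complex values inside forward via torch.view_as_complex, so that every parameter jacrev sees is real. Something like this hypothetical FFT-based layer:

```python
import torch
import torch.nn as nn

class SpectralLayer(nn.Module):  # hypothetical stand-in for my FFT-based layer
    def __init__(self, features, modes):
        super().__init__()
        # Real/imaginary parts stored as one real tensor -> real leaf parameter.
        self.weight = nn.Parameter(torch.randn(features, modes, 2) * 0.02)

    def forward(self, x):  # x: [batch, timesteps, features], real
        w = torch.view_as_complex(self.weight)           # [features, modes], complex
        xf = torch.fft.rfft(x, dim=1)                    # complex spectrum
        xf = xf[:, : w.shape[1], :] * w.transpose(0, 1)  # weight the kept modes
        return torch.fft.irfft(xf, n=x.shape[1], dim=1)  # real output again
```

With this, dict(model.named_parameters()) contains only real tensors, so the jacrev input check should pass, while the forward computation should be unchanged.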
Is the 2nd approach the correct way to compute what I want, and is there a workaround for this RuntimeError regarding complex numbers?