Is there any way to calculate a partial derivative (for instance for the slice of a tensor) in pytorch using autograd?
Background:
I am attempting to compute the Hessian of a convolutional layer's weights wrt the loss. However, computing the Hessian for the complete tensor holding the weights of all convolution kernels is not tractable. I would therefore like to compute the Hessian for each convolution kernel separately, which is significantly less complex.
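One way to get a per-kernel Hessian with autograd is to make the single kernel the input of a function and hand that function to `torch.autograd.functional.hessian`. A minimal sketch (the layer shapes, the MSE loss, and the helper `loss_for_kernel` are all illustrative assumptions, not from the original post):

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)

# Toy setup: a conv layer with 4 output kernels of shape (3, 3, 3).
weight = torch.randn(4, 3, 3, 3)   # (out_ch, in_ch, kH, kW)
x = torch.randn(1, 3, 8, 8)
target = torch.randn(1, 4, 6, 6)

def loss_for_kernel(k, idx):
    """Rebuild the weight with kernel `idx` replaced by `k`, then run the conv."""
    w = weight.clone()
    w[idx] = k
    out = F.conv2d(x, w)
    return F.mse_loss(out, target)

# Hessian of the loss wrt one kernel only: a (3*3*3)^2 problem
# instead of (4*3*3*3)^2 for the full weight tensor.
idx = 2
k = weight[idx].detach().requires_grad_(True)
H = torch.autograd.functional.hessian(lambda k: loss_for_kernel(k, idx), k)
print(H.shape)  # torch.Size([3, 3, 3, 3, 3, 3])
```

Note that, as the answer below explains, this does not make the underlying convolution kernels any cheaper per backward pass; it only shrinks the Hessian you materialize.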
While this is mathematically less complex, convolution layers are implemented in every framework as optimized functions that compute the full forward (or backward-grad / backward-weight) pass in one shot. There is no way to ask for only the backward-weight computation wrt kernel 34 (for example).
So if you take y = kernel[34] and ask for the grad wrt y, what will likely happen is that the full grad is computed, everything else is zeroed out, and you get back the slice corresponding to y.
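This behavior can be seen directly with `torch.autograd.grad`. A hedged sketch (the shapes and the `cat`-based routing through the slice are my own illustration, not the poster's code):

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)
weight = torch.randn(4, 3, 3, 3, requires_grad=True)
x = torch.randn(1, 3, 8, 8)

# Route the forward pass through the slice so autograd can reach it.
y = weight[2]                      # the one kernel we care about
w = torch.cat([weight[:2].detach(), y[None], weight[3:].detach()])
loss = F.conv2d(x, w).pow(2).sum()

# conv2d's backward still computes the weight gradient for the full
# tensor w; the detach()/cat structure merely discards the parts
# belonging to the other kernels before handing you the slice.
(g,) = torch.autograd.grad(loss, y)
print(g.shape)  # torch.Size([3, 3, 3])
```

The resulting `g` matches the corresponding slice of the full gradient, which is exactly the point: the full backward-weight kernel runs either way.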