What does the gradient of F.conv2d look like?

Recently I have been trying to customize the functional conv2d, and I have two points of confusion.
(1) PyTorch treats F.conv2d as a function (operation), so it must have forward and backward methods.
However, I cannot find them in the documentation. Where can I find them?
(2) The gradient of a vector-valued function is a Jacobian matrix, and the output of F.conv2d is a high-dimensional tensor.
What do its gradients with respect to the weight and bias look like? It is beyond my imagination.

What backward actually computes is a vector-Jacobian product rather than the full Jacobian, both for efficiency and because most people only ever need the gradient of a single scalar metric (the loss) with respect to the parameters.
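As a minimal sketch (shapes here are arbitrary, just for illustration): you hand autograd a "vector" shaped like the conv output via grad_outputs, and you get back gradients shaped exactly like weight and bias, never an explicit Jacobian matrix.

```python
import torch
import torch.nn.functional as F

x = torch.randn(1, 3, 8, 8)                            # input: N x C_in x H x W
weight = torch.randn(4, 3, 3, 3, requires_grad=True)   # C_out x C_in x kH x kW
bias = torch.randn(4, requires_grad=True)

out = F.conv2d(x, weight, bias, padding=1)             # output: 1 x 4 x 8 x 8

# The "vector" in the vector-Jacobian product; in training this would be
# d(loss)/d(out). Using ones here is equivalent to out.sum().backward().
v = torch.ones_like(out)

grad_w, grad_b = torch.autograd.grad(out, (weight, bias), grad_outputs=v)

print(grad_w.shape)  # torch.Size([4, 3, 3, 3]) -- same shape as weight
print(grad_b.shape)  # torch.Size([4])          -- same shape as bias
```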

The current conv2d CPU backward is here https://github.com/pytorch/pytorch/blob/22e3b2c9c369c5fb44476eb538fa0a308df94eff/aten/src/THNN/generic/SpatialConvolutionMM.c#L247-L414
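If the goal is to customize conv2d from Python rather than read the C source, the same input/weight gradient computations are also exposed as torch.nn.grad.conv2d_input and torch.nn.grad.conv2d_weight. Below is a rough sketch of wiring them into a custom torch.autograd.Function (fixed padding and default stride for brevity; this is not the actual internal implementation):

```python
import torch
import torch.nn.functional as F


class MyConv2d(torch.autograd.Function):
    @staticmethod
    def forward(ctx, input, weight, bias):
        ctx.save_for_backward(input, weight)
        return F.conv2d(input, weight, bias, padding=1)

    @staticmethod
    def backward(ctx, grad_output):
        input, weight = ctx.saved_tensors
        # Gradient w.r.t. the input (a transposed convolution under the hood)
        grad_input = torch.nn.grad.conv2d_input(
            input.shape, weight, grad_output, padding=1)
        # Gradient w.r.t. the weight
        grad_weight = torch.nn.grad.conv2d_weight(
            input, weight.shape, grad_output, padding=1)
        # Gradient w.r.t. the bias: one value per output channel
        grad_bias = grad_output.sum(dim=(0, 2, 3))
        return grad_input, grad_weight, grad_bias


x = torch.randn(1, 3, 8, 8, requires_grad=True)
w = torch.randn(4, 3, 3, 3, requires_grad=True)
b = torch.randn(4, requires_grad=True)

MyConv2d.apply(x, w, b).sum().backward()
print(x.grad.shape, w.grad.shape, b.grad.shape)
```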