I am trying to follow the tutorial Fusing Convolution and Batch Norm using Custom Function — PyTorch Tutorials 1.13.0+cu117 documentation to practice module fusion in a personal project. However, I found it difficult to use custom stride and padding settings, since they can lead to a shape mismatch between X and grad_X. I tried padding the gradient tensor, but sometimes the shape difference between X and grad_X is odd, which makes adding the padding tricky. Any ideas are highly appreciated. Thanks.

Yes, the backward implementation of conv2d in that tutorial does not support non-default strides and padding. Instead, you can probably use the convolution backward functions implemented in torch/nn/grad.py (`torch.nn.grad.conv2d_input` and `torch.nn.grad.conv2d_weight`), which accept the same stride and padding arguments as the forward conv2d.
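A minimal sketch of that approach (the input/weight shapes, stride, and padding below are just example values; the gradients are cross-checked against autograd):

```python
import torch
import torch.nn.functional as F
import torch.nn.grad

# Forward conv with non-default stride and padding (example values).
X = torch.randn(2, 3, 16, 16, requires_grad=True)
W = torch.randn(8, 3, 3, 3, requires_grad=True)
stride, padding = 2, 1

out = F.conv2d(X, W, stride=stride, padding=padding)
grad_out = torch.randn_like(out)

# Backward via torch/nn/grad.py -- pass the forward stride/padding
# straight through, so no manual shape fix-ups are needed.
grad_X = torch.nn.grad.conv2d_input(X.shape, W, grad_out,
                                    stride=stride, padding=padding)
grad_W = torch.nn.grad.conv2d_weight(X, W.shape, grad_out,
                                     stride=stride, padding=padding)

# Cross-check against autograd.
gX, gW = torch.autograd.grad(out, (X, W), grad_out)
print(torch.allclose(grad_X, gX, atol=1e-4))
print(torch.allclose(grad_W, gW, atol=1e-4))
```

Because the stride/padding flow into the backward calls, grad_X comes out with exactly the shape of X.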

I made a minor change to the code, trying to make it adapt to different padding in the `convolution_backward` function:

```python
# Zero-pad grad_X / grad_input so their spatial sizes match X / weight;
# odd differences are split floor/ceil between the two sides.
if X.size() != grad_X.size():
    diff_xaxis = X.size(-2) - grad_X.size(-2)
    diff_yaxis = X.size(-1) - grad_X.size(-1)
    p2d = (diff_yaxis // 2, diff_yaxis - diff_yaxis // 2,
           diff_xaxis // 2, diff_xaxis - diff_xaxis // 2)
    grad_X = F.pad(grad_X, p2d, "constant", 0.)
if weight.size() != grad_input.size():
    diff_xaxis = weight.size(-2) - grad_input.size(-2)
    diff_yaxis = weight.size(-1) - grad_input.size(-1)
    p2d = (diff_yaxis // 2, diff_yaxis - diff_yaxis // 2,
           diff_xaxis // 2, diff_xaxis - diff_xaxis // 2)
    grad_input = F.pad(grad_input, p2d, "constant", 0.)
```
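As a standalone check of that fix-up, the padding computation can be exercised on its own (a sketch with a hypothetical `pad_to_match` helper and example shapes, including an odd size difference):

```python
import torch
import torch.nn.functional as F

def pad_to_match(t, target_size):
    # Zero-pad the last two dims of `t` to match target_size,
    # splitting an odd difference floor/ceil between the sides.
    diff_x = target_size[-2] - t.size(-2)
    diff_y = target_size[-1] - t.size(-1)
    p2d = (diff_y // 2, diff_y - diff_y // 2,
           diff_x // 2, diff_x - diff_x // 2)
    return F.pad(t, p2d, "constant", 0.)

# e.g. a grad_X shrunk by stride=2, padded back to X's size
grad_X = torch.randn(2, 3, 7, 7)
X_size = torch.Size([2, 3, 16, 16])
print(pad_to_match(grad_X, X_size).shape)  # torch.Size([2, 3, 16, 16])
```

Note that this only repairs the shapes; with stride > 1 the padded gradient is not mathematically the same as the true gradient, which is why the torch/nn/grad.py route is the safer fix.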

I also passed the padding argument through to all of the conv and transpose-conv calls.

I am not sure whether this is correct, but I did find it slower with ResNet-50.