Hello,

I am trying to implement a custom Conv2d module in which grad_input (dx) and grad_weight (dw) are computed from different grad_output (dy) values. I implemented this by extending torch.autograd.Function, as in the PyTorch tutorials.

However, I am confused by the information in this link.

- Is extending autograd.Function not enough?
- What is the difference between writing a new autograd function in Python vs C++?
- How about the CUDA implementations in `/torch/nn/blob/master/lib/THNN/generic/SpatialConvolutionMM.c`, where dx and dw are calculated? Should I change them too?

Here is my custom function:

```
import torch
import torch.nn.functional as F


class myCustomConv2d(torch.autograd.Function):
    @staticmethod
    def forward(ctx, x, w, bias=None, stride=1, padding=0, dilation=1, groups=1):
        # Save the tensors and conv hyperparameters needed in backward.
        ctx.save_for_backward(x, w, bias)
        ctx.stride = stride
        ctx.padding = padding
        ctx.dilation = dilation
        ctx.groups = groups
        out = F.conv2d(x, w, bias, stride, padding, dilation, groups)
        return out

    @staticmethod
    def backward(ctx, grad_output):
        input, weight, bias = ctx.saved_tensors
        stride = ctx.stride
        padding = ctx.padding
        dilation = ctx.dilation
        groups = ctx.groups
        grad_input = grad_weight = grad_bias = None
        # myspecialfunction1/2 are placeholders for my own transforms of dy.
        dy_for_inputs = myspecialfunction1(grad_output)
        dy_for_weights = myspecialfunction2(grad_output)
        # Only compute the gradients that autograd actually asks for.
        if ctx.needs_input_grad[0]:
            grad_input = torch.nn.grad.conv2d_input(
                input.shape, weight, dy_for_inputs, stride, padding, dilation, groups)
        if ctx.needs_input_grad[1]:
            grad_weight = torch.nn.grad.conv2d_weight(
                input, weight.shape, dy_for_weights, stride, padding, dilation, groups)
        if bias is not None and ctx.needs_input_grad[2]:
            # Sum over batch and spatial dims to get one value per output channel.
            grad_bias = dy_for_weights.sum((0, 2, 3))
        # One gradient per forward argument; non-tensor arguments get None.
        return grad_input, grad_weight, grad_bias, None, None, None, None
```
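
For reference, this is the kind of sanity check I would expect to pass if extending autograd.Function alone is sufficient. It is only a minimal sketch: `scale_dy` and `SplitGradConv2d` are hypothetical stand-ins (not my real functions), and bias, dilation and groups are left out for brevity. It compares the custom gradients against the ones from the built-in `F.conv2d`, with dx using dy unchanged and dw using dy halved.

```python
import torch
import torch.nn.functional as F


def scale_dy(grad_output, factor):
    # Hypothetical stand-in for myspecialfunction1 / myspecialfunction2.
    return grad_output * factor


class SplitGradConv2d(torch.autograd.Function):
    @staticmethod
    def forward(ctx, x, w, stride=1, padding=0):
        ctx.save_for_backward(x, w)
        ctx.stride, ctx.padding = stride, padding
        return F.conv2d(x, w, None, stride, padding)

    @staticmethod
    def backward(ctx, grad_output):
        x, w = ctx.saved_tensors
        dy_for_inputs = scale_dy(grad_output, 1.0)   # dx uses dy unchanged
        dy_for_weights = scale_dy(grad_output, 0.5)  # dw uses dy halved
        grad_input = torch.nn.grad.conv2d_input(
            x.shape, w, dy_for_inputs, ctx.stride, ctx.padding)
        grad_weight = torch.nn.grad.conv2d_weight(
            x, w.shape, dy_for_weights, ctx.stride, ctx.padding)
        # One gradient per forward argument (x, w, stride, padding).
        return grad_input, grad_weight, None, None


torch.manual_seed(0)
x = torch.randn(2, 3, 8, 8, dtype=torch.double, requires_grad=True)
w = torch.randn(4, 3, 3, 3, dtype=torch.double, requires_grad=True)

# Reference gradients from the built-in convolution.
ref_dx, ref_dw = torch.autograd.grad(F.conv2d(x, w, None, 1, 1).sum(), (x, w))
# Gradients from the custom Function (apply takes positional arguments only).
dx, dw = torch.autograd.grad(SplitGradConv2d.apply(x, w, 1, 1).sum(), (x, w))

dx_matches = torch.allclose(dx, ref_dx)
dw_is_halved = torch.allclose(dw, 0.5 * ref_dw)
print(dx_matches, dw_is_halved)
```

If both checks hold, the Python-level backward is what autograd actually uses for this op, which is the behaviour I am trying to confirm.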