How to set gradients of some elements inside a convolution kernel always equal to 0?

Hello everyone,

I am trying to implement a custom convolution kernel.
Let me take dilated convolution as an example. (I know dilated convolutions already exist in PyTorch; this is just an example to state my question.)
The kernel should look something like:

[a, 0, b, 0, c],
[0, 0, 0, 0, 0],
[d, 0, e, 0, f],
[0, 0, 0, 0, 0],
[g, 0, h, 0, i],

Browsing other discussions, I know how to construct such a kernel and then use F.conv2d.

But when calculating the gradients, those 0 values in the kernel still receive gradients. I want them to stay 0 all the time and never update; only [a, b, c, d, e, f, g, h, i] should be updated.

How can I do this?

I know one way is to manually set weights.grad[indices of the zeros] = 0, but if the model has many kernels of this kind, that will not be efficient.


I think zeroing out the gradients is a valid approach, and you could register hooks via param.register_hook to perform this operation if that would be more convenient.
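Something like this minimal sketch, where a fixed 0/1 mask marks the trainable positions and a hook multiplies the gradient by the mask on every backward pass (the class name and shapes are just illustrative):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class MaskedConv2d(nn.Module):
    """Conv layer whose kernel is trainable only at the positions a..i."""

    def __init__(self):
        super().__init__()
        self.weight = nn.Parameter(torch.randn(1, 1, 5, 5))
        # mask is 1 at the trainable positions (even rows/cols), 0 elsewhere
        mask = torch.zeros(5, 5)
        mask[::2, ::2] = 1.0
        self.register_buffer("mask", mask.expand_as(self.weight).clone())
        # zero the fixed positions once ...
        with torch.no_grad():
            self.weight.mul_(self.mask)
        # ... and zero their gradients automatically on every backward pass
        self.weight.register_hook(lambda grad: grad * self.mask)

    def forward(self, x):
        return F.conv2d(x, self.weight, padding=2)
```

Since the masked weights start at 0 and their gradients are always zeroed, no optimizer step can ever change them.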
Alternatively, you could create a trainable nn.Parameter containing the values [a-i] and recreate the kernel in each iteration, e.g. by using torch.scatter or torch.gather (depending on the actual shape). However, I would prefer the first approach, as it sounds simpler and cleaner.

That makes sense. Thanks!