Calculating gradients of activation maps

I have a 4D tensor X of activation maps of size (bs, channels, dim, dim), e.g.,

import torch
bs = 3
channels = 512
dim = 64
X = torch.rand(bs, channels, dim, dim)

I want to calculate the (x, y)-gradients of the activation maps (which can roughly be seen as "images"). I think this can be done using a 2D convolution with fixed weights. For the x-gradient, for instance,

import numpy as np
import torch.nn as nn

# Sobel kernel for the x-direction
grad_x_weights = np.array([[1, 0, -1],
                           [2, 0, -2],
                           [1, 0, -1]])
conv_x = nn.Conv2d(512, 512, kernel_size=3, stride=1, padding=1, bias=False)
conv_x.weight = nn.Parameter(torch.from_numpy(grad_x_weights).float().expand(1, 512, 3, 3))

grad_x = conv_x(X)

This (as expected) gives an output of size (3, 1, 64, 64), but what I would like is the gradient of each activation map separately, i.e., something of size (3, 512, 64, 64) such that grad_x[i, j, :, :] holds the x-gradient of the j-th activation map of the i-th input.

Finally, I would like to make grad_x_weights non-learnable, so that conv_x will always calculate the gradients of the activation maps.

Thank you!

You could probably use the groups parameter:

conv_x = nn.Conv2d(512, 512, kernel_size=3, stride=1, padding=1, bias=False, groups=512)
# weight shape (512, 1, 3, 3): one 3x3 kernel per input channel (depthwise convolution)
conv_x.weight = nn.Parameter(torch.from_numpy(grad_x_weights).float().expand(512, 1, 3, 3))

grad_x = conv_x(X)
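
With groups=512, each input channel is convolved with its own copy of the kernel, so the output keeps the shape (3, 512, 64, 64) and grad_x[i, j, :, :] is the x-gradient of the j-th activation map of the i-th input.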

To freeze the weights you could set their requires_grad flag to False.
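
If it helps, here is a minimal sketch putting both pieces together (depthwise Sobel filters with frozen weights). The make_grad_conv helper and the y-kernel (transpose of the x-kernel) are my own additions for illustration, not part of the original code:

import numpy as np
import torch
import torch.nn as nn

def make_grad_conv(kernel, channels=512):
    # One fixed 3x3 kernel per channel, applied depthwise via groups=channels.
    conv = nn.Conv2d(channels, channels, kernel_size=3, stride=1, padding=1,
                     bias=False, groups=channels)
    weight = torch.from_numpy(kernel).float().expand(channels, 1, 3, 3)
    # clone() materializes the expanded view; requires_grad=False keeps it non-learnable
    conv.weight = nn.Parameter(weight.clone(), requires_grad=False)
    return conv

grad_x_weights = np.array([[1, 0, -1],
                           [2, 0, -2],
                           [1, 0, -1]])
grad_y_weights = grad_x_weights.T.copy()  # Sobel kernel for the y-direction

conv_x = make_grad_conv(grad_x_weights)
conv_y = make_grad_conv(grad_y_weights)

X = torch.rand(3, 512, 64, 64)
grad_x = conv_x(X)  # shape (3, 512, 64, 64)
grad_y = conv_y(X)  # shape (3, 512, 64, 64)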