I have a 4D tensor of activation maps, i.e., `X`, of size `(bs, channels, dim, dim)`; e.g.,

```
import torch
bs = 3
channels = 512
dim = 64
X = torch.rand(bs, channels, dim, dim)
```

I want to calculate the x- and y-gradients of the activation maps (which can roughly be seen as “images”). I think that this can be done using a 2D convolution with fixed weights. For the x-gradient, for instance,

```
import numpy as np
import torch.nn as nn
grad_x_weights = np.array([[1, 0, -1],
                           [2, 0, -2],
                           [1, 0, -1]])
conv_x = nn.Conv2d(512, 512, kernel_size=3, stride=1, padding=1, bias=False)
conv_x.weight = nn.Parameter(torch.from_numpy(grad_x_weights).float().expand(1, 512, 3,3))
grad_x = conv_x(X)
```

This will (as expected) give an output of size `(3, 1, 64, 64)`, but what I would like to have is the gradient of *each* activation map, so something of size `(3, 512, 64, 64)`, such that `grad_x[i, j, :, :]` holds the x-gradient of the j-th activation map of the i-th input.

Finally, I would like to set `grad_x_weights` as non-learnable, so that `conv_x` will always calculate the gradients of the activation maps.
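To make the target concrete, here is a minimal sketch of the behaviour I am after. It assumes that a depthwise convolution (`groups=channels`) is a suitable way to filter each map independently; the names `sobel_x` and `weight` are only illustrative, and I am not sure this is the recommended approach.

```python
import torch
import torch.nn.functional as F

bs, channels, dim = 3, 512, 64
X = torch.rand(bs, channels, dim, dim)

# Fixed 3x3 kernel for the x-direction (same values as grad_x_weights above)
sobel_x = torch.tensor([[1., 0., -1.],
                        [2., 0., -2.],
                        [1., 0., -1.]])

# One copy of the kernel per channel: shape (channels, 1, 3, 3).
# A plain tensor has requires_grad=False, so it is never updated in training.
weight = sobel_x.view(1, 1, 3, 3).repeat(channels, 1, 1, 1)

# Assumption: groups=channels makes every channel use only its own kernel
grad_x = F.conv2d(X, weight, padding=1, groups=channels)
print(grad_x.shape)
```

If this lived inside an `nn.Module`, I assume the kernel could be stored with `register_buffer` so that it moves with the module across devices but is not returned by `parameters()`.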

Thank you!