I have an image of size (1, 32, 32) and I’m using nn.Conv2d with a 5x5 kernel and stride 1, which gives a (28, 28) output after the convolution.

I want to get the elementwise products between the kernel weights (5x5) and the input patch (5x5, at each of the 28x28 positions) before the operation sums the 25 elements together.

I thought about unfolding the source image and using conv2d with stride 5, but that again sums all the 5x5 elements.
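One way to get the per-element products is to unfold the input into its 5x5 patches and broadcast-multiply them against the flattened kernel yourself, instead of letting conv2d do the sum. A minimal sketch (the shapes match the question; the variable names are my own):

```python
import torch
import torch.nn.functional as F

# One (1, 32, 32) image as a (N, C, H, W) batch, and a 5x5 conv without bias.
x = torch.randn(1, 1, 32, 32)
conv = torch.nn.Conv2d(1, 1, kernel_size=5, stride=1, bias=False)

# Extract every 5x5 patch: shape (N, C*5*5, L) with L = 28*28 sliding positions.
patches = F.unfold(x, kernel_size=5, stride=1)   # (1, 25, 784)

# Broadcast-multiply the flattened kernel against each patch;
# this keeps the 25 individual products instead of summing them.
w = conv.weight.view(1, -1, 1)                   # (1, 25, 1)
products = patches * w                           # (1, 25, 784)

# Sanity check: summing over the 25 elements reproduces the convolution.
out = products.sum(dim=1).view(1, 1, 28, 28)
assert torch.allclose(out, conv(x), atol=1e-5)
```

`products[:, :, i]` then holds the 25 products for the i-th sliding position, before summation.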

Thank you, I did as you suggested.
I have a follow-up question. Most of the time I try to stick to torch.nn modules / torch methods because I know they will compute the gradients correctly. Here I tried something a bit unconventional, like calling unfold on the input and then applying a different activation function to different patches of the unfolded image.
Or, for example, using unfold on the hidden layer.
Is there a way to make sure it calculates the gradients right?
I’m familiar with .requires_grad and autograd, but I’m not sure whether the gradients are computed the way I would expect.
Is there a guide that can help me build intuition about how gradients are calculated in PyTorch?
Do I need to look at the gradients explicitly and compare them to analytical results?

unfold is differentiable, so it won’t detach the tensor from the computation graph.
To verify that this is indeed the case, you can check the .grad_fn attribute of the output tensor of any operation and see whether it points to a valid backward function (a grad_fn of None would mean the tensor is not attached to a computation graph).
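For example (small made-up shapes, just to illustrate the check):

```python
import torch
import torch.nn.functional as F

x = torch.randn(1, 1, 8, 8, requires_grad=True)
patches = F.unfold(x, kernel_size=3)

# A differentiable op applied to a tensor that requires grad records a
# grad_fn on its output; a detached tensor has grad_fn = None.
print(patches.grad_fn)        # a valid backward function object
print(x.detach().grad_fn)     # None
```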
To check the correctness of the gradient calculation you could use gradcheck.
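A sketch of gradcheck on a patch-wise function like the one you describe (the per-patch activations here are my own made-up example; gradcheck compares analytical gradients against numerical finite differences and expects double-precision inputs):

```python
import torch
import torch.nn.functional as F
from torch.autograd import gradcheck

def patchwise_fn(x):
    patches = F.unfold(x, kernel_size=2)    # (N, C*2*2, L)
    # Hypothetical "different activation per patch": tanh on the first
    # half of the sliding positions, sigmoid on the rest.
    half = patches.shape[-1] // 2
    return torch.cat([patches[..., :half].tanh(),
                      patches[..., half:].sigmoid()], dim=-1)

# gradcheck requires dtype=torch.double and requires_grad=True.
x = torch.randn(1, 1, 4, 4, dtype=torch.double, requires_grad=True)
ok = gradcheck(patchwise_fn, (x,))
print(ok)    # True if analytical and numerical gradients match
```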