I'm currently working on implementing Conv2d from scratch, without using autograd. I already have the forward pass working using unfold, but something goes wrong in the backward pass. As I understand it, once you unfold your input and flatten your kernel, the forward and backward passes are basically the same as in an ordinary linear layer. But when I implement it that way, I get weird shapes. Here's the code I'm currently using:
```python
import torch

# as an input I'm using a tensor with the size of an MNIST digit
img = torch.randn(1, 1, 28, 28)
# kernel with 1 input dim and 1 output dim
kernel = torch.randn(1, 1, 3, 3)
# unfold with kernel size 3 and padding 1
unfold = torch.nn.Unfold(kernel_size=(3, 3), padding=1)
# unfolded data
img_unfolded = unfold(img)

# forward pass with the kernel and the unfolded data:
# reshape the kernel to (out_dim, -1) and multiply
output = kernel.view(1, -1).matmul(img_unfolded)
# output has shape (1, 1, 784); you'd then reshape it to (1, 1, 28, 28)
# to have the right dimensions for the next layer
```
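(As a sanity check, not part of the question itself: the unfold-based forward pass above should match `torch.nn.functional.conv2d` with the same kernel and padding, which is how I convinced myself the forward pass is correct.)

```python
import torch
import torch.nn.functional as F

img = torch.randn(1, 1, 28, 28)
kernel = torch.randn(1, 1, 3, 3)

# unfold-based forward pass, same as in the question
unfold = torch.nn.Unfold(kernel_size=(3, 3), padding=1)
img_unfolded = unfold(img)                        # (1, 9, 784)
output = kernel.view(1, -1).matmul(img_unfolded)  # (1, 1, 784)

# reference convolution with identical parameters
reference = F.conv2d(img, kernel, padding=1)      # (1, 1, 28, 28)
assert torch.allclose(output.view(1, 1, 28, 28), reference, atol=1e-5)
```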
Now assume that the gradients of the output layer are already computed as `randn(1, 1, 784)`:
```python
# gradients already calculated, with the shape of the output layer
grads_output = torch.randn(1, 1, 784)

# So how do I get the gradients with respect to the weights?
# I tried the transposed unfolded input times the output,
# but the shape is just weird:
grads_weights = img_unfolded.T.matmul(output)
# shape is (784, 9, 784), which is clearly wrong, because the
# right shape would be (1, 9)
```
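(To verify that (1, 9) really is the shape I should be aiming for, I cross-checked with autograd on the same flattened setup; this is only a shape check, not the manual backward pass I'm trying to write.)

```python
import torch

img = torch.randn(1, 1, 28, 28)
# the kernel already flattened to (out_dim, -1), tracked by autograd
w = torch.randn(1, 9, requires_grad=True)

unfold = torch.nn.Unfold(kernel_size=(3, 3), padding=1)
img_unfolded = unfold(img)       # (1, 9, 784)
output = w.matmul(img_unfolded)  # (1, 1, 784)

# backprop some upstream gradient of the output's shape
grads_output = torch.randn(1, 1, 784)
output.backward(grads_output)
print(w.grad.shape)  # the weight gradient has shape (1, 9)
```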
But when I try to compute the gradients for the next layer, the result is what I'm expecting:
```python
# transposed, reshaped kernel times the gradients of the output
grads_next_layer = kernel.view(1, -1).T.matmul(grads_output)
# as said, this has the right shape of (1, 9, 784)
```
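(For completeness, here is a sketch of what I assume the next step would be: the (1, 9, 784) gradient with respect to the unfolded input can be folded back to image shape with `torch.nn.Fold`, which sums the contributions of overlapping patches.)

```python
import torch

kernel = torch.randn(1, 1, 3, 3)
grads_output = torch.randn(1, 1, 784)

# gradient w.r.t. the unfolded input, as in the question
grads_next_layer = kernel.view(1, -1).T.matmul(grads_output)  # (1, 9, 784)

# Fold is the inverse layout operation of Unfold (same kernel/padding);
# overlapping positions are summed, which is what backprop requires
fold = torch.nn.Fold(output_size=(28, 28), kernel_size=(3, 3), padding=1)
grads_input = fold(grads_next_layer)  # (1, 1, 28, 28)
print(grads_input.shape)
```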
So my question is: how do I correctly calculate the gradients with respect to the weights?
Thanks for taking the time to read this, and have a nice day!