Hi, I am trying to implement a CNN in pure Python to understand how the magic happens.
My input is a set of RGB images,
32 x 32 in size.
I made a convolutional filter that converts this
1 x 3 x 32 x 32 tensor to
1 x 1 x 32 x 32, and then I apply a maxpool layer, which reduces the size to
1 x 1 x 16 x 16.
My question is how do I classify by applying a softmax now?
How can I pass gradients if I apply softmax on the 16 x 16 dimensional vector?
I think you can derive the backprop equation for the softmax part by hand. For the maxpool layer, set the gradient to 0 at the non-max positions. For example, if d_out = 10 for one pooling window, then in the maxpool backward pass you pass 10 back as the gradient at the position of the maximum and 0 at every other position in that window.
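To make the maxpool rule concrete, here is a minimal pure-Python sketch. It assumes a single 2D feature map, a 2 x 2 window with stride 2, and even height and width; the function names `maxpool2x2_forward` and `maxpool2x2_backward` are made up for illustration. The forward pass remembers where each maximum came from so the backward pass can route the gradient there.

```python
def maxpool2x2_forward(x):
    """x: 2D list (H x W, both even). Returns (pooled output, argmax positions)."""
    h, w = len(x), len(x[0])
    out, argmax = [], []
    for i in range(0, h, 2):
        out_row, arg_row = [], []
        for j in range(0, w, 2):
            # Collect (value, position) for the four cells in this window.
            window = [(x[i + di][j + dj], (i + di, j + dj))
                      for di in (0, 1) for dj in (0, 1)]
            value, pos = max(window, key=lambda t: t[0])  # first max wins on ties
            out_row.append(value)
            arg_row.append(pos)
        out.append(out_row)
        argmax.append(arg_row)
    return out, argmax


def maxpool2x2_backward(d_out, argmax, h, w):
    """Route each upstream gradient to the position of the max; zeros elsewhere."""
    d_x = [[0.0] * w for _ in range(h)]
    for i, row in enumerate(argmax):
        for j, (r, c) in enumerate(row):
            d_x[r][c] += d_out[i][j]
    return d_x


x = [[1.0, 3.0, 2.0, 0.0],
     [4.0, 2.0, 1.0, 5.0],
     [0.0, 1.0, 7.0, 2.0],
     [2.0, 6.0, 3.0, 1.0]]
out, argmax = maxpool2x2_forward(x)          # out == [[4.0, 5.0], [6.0, 7.0]]
d_x = maxpool2x2_backward([[10.0, 20.0], [30.0, 40.0]], argmax, 4, 4)
# d_x is zero everywhere except d_x[1][0] == 10.0, d_x[1][3] == 20.0,
# d_x[3][1] == 30.0 and d_x[2][2] == 40.0.
```

For the 32 x 32 map in the question, the same functions would be called with h = w = 32 and a 16 x 16 `d_out`.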
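For backprop through the conv layer, you need a deconv (transposed convolution) operation: assuming the forward pass is the usual sliding-window cross-correlation, the gradient with respect to the input is a full cross-correlation of the upstream gradient with the kernel rotated by 180 degrees, and the gradient with respect to the kernel is the cross-correlation of the input with the upstream gradient.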
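As a rough illustration of the conv backward pass, here is a pure-Python sketch. It is simplified compared with the setup in the question: one input channel, one 2D kernel, stride 1, no padding, and no bias, and the names `conv2d_forward` and `conv2d_backward` are hypothetical. The backward pass is written as a scatter loop, which computes the same input gradient as the "deconv" (transposed convolution) formulation.

```python
def conv2d_forward(x, w):
    """Valid cross-correlation of 2D input x with 2D kernel w (stride 1, no padding)."""
    kh, kw = len(w), len(w[0])
    oh, ow = len(x) - kh + 1, len(x[0]) - kw + 1
    return [[sum(x[i + a][j + b] * w[a][b]
                 for a in range(kh) for b in range(kw))
             for j in range(ow)]
            for i in range(oh)]


def conv2d_backward(x, w, d_out):
    """Given the upstream gradient d_out, return (d_x, d_w)."""
    kh, kw = len(w), len(w[0])
    d_w = [[0.0] * kw for _ in range(kh)]
    d_x = [[0.0] * len(x[0]) for _ in range(len(x))]
    for i, row in enumerate(d_out):
        for j, g in enumerate(row):
            for a in range(kh):
                for b in range(kw):
                    # out[i][j] used x[i+a][j+b] * w[a][b], so each factor
                    # receives g times the other factor.
                    d_w[a][b] += g * x[i + a][j + b]
                    d_x[i + a][j + b] += g * w[a][b]
    return d_x, d_w


x = [[1.0, 2.0, 0.0],
     [0.0, 1.0, 3.0],
     [2.0, 1.0, 1.0]]
w = [[1.0, -1.0],
     [0.5, 2.0]]
d_out = [[1.0, 1.0],
         [1.0, 1.0]]
d_x, d_w = conv2d_backward(x, w, d_out)
# d_w == [[4.0, 6.0], [4.0, 6.0]]; d_x[1][1] == 2.5 (every kernel weight touches it)
```

With three input channels, the same loop would run once per channel with that channel's kernel slice, and the forward outputs would be summed across channels.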
What is the backprop equation for the softmax part? Could you give an example here? And how can I de-flatten the gradients?
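In your forward pass, use
out = out.view(-1) to flatten the output to 1D.
In the backward pass, reshape the gradient back to the pre-flatten shape:
d_out = d_out.view(1, 1, 16, 16)
(.view is the PyTorch tensor method; with NumPy arrays the equivalent call is reshape.)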
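Since the goal is pure Python, the same flatten and de-flatten step can be sketched with nested lists. This assumes the 1 x 1 x 16 x 16 output is stored as a single 16 x 16 list of rows; `flatten` and `unflatten` are illustrative names, not library functions.

```python
def flatten(feature_map):
    """Row-major flatten of a 2D list into a 1D list."""
    return [v for row in feature_map for v in row]


def unflatten(flat, h, w):
    """Inverse of flatten: rebuild an h x w 2D list from a 1D list."""
    if len(flat) != h * w:
        raise ValueError("length does not match h * w")
    return [flat[r * w:(r + 1) * w] for r in range(h)]


pooled = [[float(r * 16 + c) for c in range(16)] for r in range(16)]
flat = flatten(pooled)                 # 256 values, fed to the classifier
d_flat = [0.1] * len(flat)             # stand-in for the gradient coming back
d_pooled = unflatten(d_flat, 16, 16)   # same 16 x 16 layout as the forward output
```

The only requirement is that the backward pass uses the same row-major ordering as the forward pass, so each gradient value lands on the element it belongs to.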
For the backpropagation equations, a quick Google search will turn up the complete derivation. I don't have a picture of it at hand right now.
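For what it's worth, when softmax is combined with a cross-entropy loss the result is short enough to state directly: with probabilities p = softmax(z) and a one-hot target y, the gradient with respect to the logits is dL/dz_i = p_i - y_i. Below is a minimal pure-Python sketch of that. It assumes the logits z hold one score per class (for example from a fully connected layer on the 256 flattened values, which is an assumption here and not something stated above), and `softmax_cross_entropy_grad` is a made-up name.

```python
import math


def softmax(z):
    """Numerically stable softmax over a list of logits."""
    m = max(z)                                  # subtract the max to avoid overflow
    exps = [math.exp(v - m) for v in z]
    total = sum(exps)
    return [e / total for e in exps]


def softmax_cross_entropy_grad(z, target):
    """Gradient of -log(softmax(z)[target]) with respect to each logit."""
    p = softmax(z)
    return [p_i - (1.0 if i == target else 0.0) for i, p_i in enumerate(p)]


logits = [1.0, 2.0, 3.0]
grad = softmax_cross_entropy_grad(logits, target=2)
# grad[2] is negative (the score of the correct class should go up),
# the other entries are positive, and the entries sum to roughly zero.
```

This gradient is what gets propagated back through the layer that produced the logits, and from there through the de-flatten, maxpool, and conv steps.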