How to pass gradients from the softmax to the convolution/maxpool layer?

Hi, I am trying to implement a CNN in pure Python to understand how the magic happens.

My input is a set of RGB images, each 32 x 32 in size.

I made a convolutional filter that converts this 1 x 3 x 32 x 32 tensor to 1 x 1 x 32 x 32, and then I apply a maxpool layer, which reduces the size to 1 x 1 x 16 x 16.

My question is how do I classify by applying a softmax now?

How can I pass gradients if I apply softmax on the 16 x 16 dimensional vector?

I think you can derive the backprop equation for the softmax part. For the maxpool layer, set the gradients to 0 for the non-max positions. For example, if d_out = 10 for a pooling window, pass 10 back as the gradient for the position that held the maximum and 0 for the others.
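A minimal sketch of that routing in plain Python, assuming a single-channel 2D input stored as nested lists, 2 x 2 windows with stride 2, and even height and width. The function names maxpool2x2_forward and maxpool2x2_backward are made up for this illustration.

```python
def maxpool2x2_forward(x):
    """2x2 max pooling with stride 2; also records where each max came from."""
    h, w = len(x), len(x[0])
    out, argmax = [], []
    for i in range(0, h, 2):
        out_row, arg_row = [], []
        for j in range(0, w, 2):
            # The four (row, col) positions covered by this window.
            window = [(i + di, j + dj) for di in (0, 1) for dj in (0, 1)]
            best = max(window, key=lambda p: x[p[0]][p[1]])
            out_row.append(x[best[0]][best[1]])
            arg_row.append(best)
        out.append(out_row)
        argmax.append(arg_row)
    return out, argmax


def maxpool2x2_backward(d_out, argmax, in_shape):
    """Send each upstream gradient to the position that held the max; 0 elsewhere."""
    h, w = in_shape
    d_x = [[0.0] * w for _ in range(h)]
    for i, row in enumerate(argmax):
        for j, (r, c) in enumerate(row):
            d_x[r][c] += d_out[i][j]
    return d_x


x = [[1, 3, 2, 0],
     [4, 2, 1, 5],
     [0, 1, 7, 2],
     [6, 2, 3, 1]]
out, argmax = maxpool2x2_forward(x)            # out is [[4, 5], [6, 7]]
d_out = [[10.0, 20.0], [30.0, 40.0]]           # made-up upstream gradient
d_x = maxpool2x2_backward(d_out, argmax, (4, 4))
```

Here the 10.0 ends up at the position of the 4, the 20.0 at the 5, and so on, while every other entry of d_x stays 0.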
For backprop through the conv layers, you need a deconv operation (more precisely, a transposed convolution of the upstream gradient with the filter).
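As a rough illustration of the conv backward pass, here is a plain Python sketch under simplifying assumptions that differ from the setup in the question: one input channel, one filter, stride 1, and no padding. It accumulates the gradients position by position, which gives the same input gradient as the transposed convolution. The names conv2d_forward and conv2d_backward are hypothetical.

```python
def conv2d_forward(x, w):
    """Single-channel 2D cross-correlation, stride 1, no padding."""
    kh, kw = len(w), len(w[0])
    oh, ow = len(x) - kh + 1, len(x[0]) - kw + 1
    return [[sum(x[i + a][j + b] * w[a][b]
                 for a in range(kh) for b in range(kw))
             for j in range(ow)]
            for i in range(oh)]


def conv2d_backward(x, w, d_out):
    """Gradients of the loss w.r.t. the input x and the filter w."""
    kh, kw = len(w), len(w[0])
    d_x = [[0.0] * len(x[0]) for _ in x]
    d_w = [[0.0] * kw for _ in range(kh)]
    for i, row in enumerate(d_out):
        for j, g in enumerate(row):
            for a in range(kh):
                for b in range(kw):
                    # out[i][j] used x[i+a][j+b] * w[a][b], so both get a share of g.
                    d_w[a][b] += g * x[i + a][j + b]
                    d_x[i + a][j + b] += g * w[a][b]
    return d_x, d_w


x = [[1.0, 2.0, 3.0],
     [4.0, 5.0, 6.0],
     [7.0, 8.0, 9.0]]
w = [[1.0, 0.0],
     [0.0, -1.0]]
out = conv2d_forward(x, w)                 # 2 x 2 output
d_out = [[1.0, 1.0], [1.0, 1.0]]           # made-up upstream gradient
d_x, d_w = conv2d_backward(x, w, d_out)
```

For a 3-channel input like the one in the question, the same loops would presumably need an extra channel index on x, w, d_x, and d_w.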

What is the backprop equation for the softmax part? Could you give an example here? And how can I unflatten the gradients?

In your forward pass, use out = out.view(-1) to flatten the output to 1D.
In the backward pass, use d_out = d_out.view(1, 1, 16, 16) to restore the original shape.
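The .view calls above are tensor methods. If the data lives in plain nested Python lists instead, the same round trip could look roughly like the sketch below. The helpers flatten and unflatten are hypothetical names, and row-major order is assumed.

```python
def flatten(t):
    """Recursively flatten a nested list into one flat list (row-major order)."""
    if not isinstance(t, list):
        return [t]
    flat = []
    for item in t:
        flat.extend(flatten(item))
    return flat


def unflatten(flat, shape):
    """Rebuild a nested list with the given shape from a flat list."""
    if len(shape) == 1:
        return list(flat[:shape[0]])
    step = len(flat) // shape[0]
    return [unflatten(flat[k * step:(k + 1) * step], shape[1:])
            for k in range(shape[0])]


shape = (1, 1, 16, 16)
out = unflatten([float(v) for v in range(256)], shape)  # stand-in maxpool output
flat = flatten(out)                    # forward pass: 256 values in a 1D list
d_flat = [0.1 * v for v in flat]       # stand-in gradient coming back from softmax
d_out = unflatten(d_flat, shape)       # backward pass: back to 1 x 1 x 16 x 16
```

Because flattening only reorders nothing and copies values, the backward step is just the inverse reshape: each gradient entry goes back to the position its value came from.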

For backpropagation, a quick Google search will turn up the complete derivation. I don’t have a picture of it right now.
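For reference, the standard result, assuming softmax is followed by a cross-entropy loss with a single correct class, is that the gradient with respect to each input score is the softmax probability minus 1 for the correct class and minus 0 for the others. The sketch below implements that and checks it against a finite-difference estimate. The scores z and the target index are made-up values, and the function names are chosen for this example only.

```python
import math


def softmax(z):
    """Numerically stable softmax over a flat list of scores."""
    m = max(z)
    exps = [math.exp(v - m) for v in z]
    s = sum(exps)
    return [e / s for e in exps]


def cross_entropy(z, target):
    """Loss for one example whose correct class index is `target`."""
    return -math.log(softmax(z)[target])


def softmax_cross_entropy_grad(z, target):
    """dL/dz_i = p_i - 1 for the target class, p_i for every other class."""
    p = softmax(z)
    return [p_i - (1.0 if i == target else 0.0) for i, p_i in enumerate(p)]


z = [2.0, 1.0, 0.1]     # made-up scores feeding the softmax
target = 0              # made-up correct class
grad = softmax_cross_entropy_grad(z, target)

# Sanity check with central finite differences.
eps = 1e-6
numeric = []
for i in range(len(z)):
    z_plus = list(z)
    z_plus[i] += eps
    z_minus = list(z)
    z_minus[i] -= eps
    numeric.append((cross_entropy(z_plus, target)
                    - cross_entropy(z_minus, target)) / (2 * eps))
```

The list grad has the same length as z, so it can be reshaped to the shape of whatever layer produced the scores before being passed further back.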
