Back prorogation from fc layer to conv layer

flipflop · November 11, 2020, 4:33pm

I am trying to manually derive and update the gradient for a simple nn .
input size is 12*12 ,apply 10*3*3 kernel ,so i have a conv layer with output features 10*10*10 ,and by flattening it to 1*1000 to create one dimension vector to connect to a fully connected layer with weight dimension 1000*2. I can get the fc layers weight to be updated , but how do I update weight value from 1000*2 matrix to 10*3*3 kernel matrix

Kushaj · November 11, 2020, 8:12pm

Your loss function contains average. When you backprop from loss to fc layers you will get a scaler value (a vector in case of batching) due the the average. During calculation of gradients of conv layer you will multiply the previous scaler with the grads of conv layer.

If you derive the equations then it will become clearer. If something needs more clarification, feel free to ask.

flipflop · November 12, 2020, 3:25am

Hi kushaj ,thank you for your answer , but i am still confused , if back propagate from loss to fc layer weight ,which here is 10002 , how do we get a scalar ? shouldnt be still 10002 weight matrix with all the weight value updated ?

Kushaj · November 12, 2020, 11:03am

I think the confusion is due to the limitation of the text. Check the derivation below and I am happy to help with any more questions.

Note:- To implement in pytorch, you need to respect the dimensions (like adding squeeze and unsqueeze where necessary).