Fast.ai SimpleSelfAttention implementation in pre-trained ResNet-50 or VGG-16

Hi Team PyTorch,

I am pretty new to deep learning, and I am lost on how to add a SimpleSelfAttention layer to my existing pre-trained VGG-16 or ResNet-50 model.
SimpleSelfAttention is based on fast.ai code, from the GitHub repository sdoria/SimpleSelfAttention ("A simpler version of the self-attention layer from SAGAN, and some image classification results"). I am unable to wrap my head around this. I tried applying the layer at avgpool; I was able to add it, but it did not work.
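
To make the question concrete, here is a minimal sketch of what I think the layer computes and where I would put it. It is my own rewrite from my reading of the repo, not the repo's code: the class name `SimpleSelfAttentionSketch` and the tiny stand-in backbone are made up for illustration, and I have left out anything beyond the core computation. My assumption is that with a torchvision ResNet-50 the same module would sit between the last convolutional stage and `avgpool`, with the channel count matching that stage's output.

```python
import torch
import torch.nn as nn


class SimpleSelfAttentionSketch(nn.Module):
    """Hypothetical sketch of a simplified self-attention layer (not the repo's code)."""

    def __init__(self, n_channels: int):
        super().__init__()
        # 1x1 convolution applied over the flattened spatial positions
        self.conv = nn.Conv1d(n_channels, n_channels, kernel_size=1, bias=False)
        # gamma starts at 0, so the layer initially passes its input through unchanged
        self.gamma = nn.Parameter(torch.zeros(1))

    def forward(self, x):
        size = x.size()
        x = x.reshape(size[0], size[1], -1)        # (B, C, N) with N = H * W
        convx = self.conv(x)                       # (B, C, N)
        xxT = torch.bmm(x, x.transpose(1, 2))      # (B, C, C) channel affinity
        out = torch.bmm(xxT, convx)                # (B, C, N)
        out = self.gamma * out + x                 # residual connection
        return out.reshape(size)


# Stand-in for a pre-trained convolutional backbone (assumption: 32 output channels).
backbone = nn.Sequential(
    nn.Conv2d(3, 16, kernel_size=3, padding=1), nn.ReLU(),
    nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(),
)

# Attention goes on the 4D feature map, i.e. before pooling and flattening.
model = nn.Sequential(
    backbone,
    SimpleSelfAttentionSketch(32),
    nn.AdaptiveAvgPool2d(1),
    nn.Flatten(),
    nn.Linear(32, 10),   # custom fully connected head
)

logits = model(torch.randn(2, 3, 8, 8))
print(logits.shape)
```

Since the layer expects a feature map of shape (batch, channels, height, width), I suspect my attempt failed because I attached it in place of, or after, the pooling step rather than before it, but I am not sure.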
Also, the pre-trained parameters are frozen and I have my own custom fully connected layer. If I add the attention layer at, let us say, the 40th layer, will the network be trainable after the 40th layer? Any help would be highly appreciated.
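
My current understanding, which may be wrong, is that freezing is set per parameter through `requires_grad`, so inserting a new layer would not by itself unfreeze the layers that come after it. The sketch below is how I would check this on a toy model; the small `nn.Linear` stack is only a stand-in for the pre-trained network, and all names are made up for illustration.

```python
import torch
import torch.nn as nn

# Stand-in for a pre-trained network; every parameter is frozen.
pretrained = nn.Sequential(nn.Linear(8, 8), nn.ReLU(), nn.Linear(8, 8), nn.ReLU())
for p in pretrained.parameters():
    p.requires_grad = False

new_layer = nn.Linear(8, 8)   # freshly created layers are trainable by default
head = nn.Linear(8, 2)        # custom fully connected head, also trainable

# Insert the new layer in the middle of the frozen stack.
model = nn.Sequential(
    pretrained[0], pretrained[1],
    new_layer,
    pretrained[2], pretrained[3],
    head,
)

loss = model(torch.randn(4, 8)).sum()
loss.backward()

# Only parameters with requires_grad=True receive gradients.
trainable = [name for name, p in model.named_parameters() if p.requires_grad]
print(trainable)

# Pass only the trainable parameters to the optimizer.
optimizer = torch.optim.SGD(
    (p for p in model.parameters() if p.requires_grad), lr=0.01
)
```

If that is right, the frozen layers after the inserted one stay frozen unless I set `requires_grad = True` on them myself, while gradients still flow back through them to the new layer.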
