As we know, a convolution layer is essentially a moving window (filter) that slides across the input image and is applied to a local neighborhood of nodes at each position. This does not look like a simple matrix operation, so how do we differentiate it?
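Internally, convolutions are often computed as a matrix multiplication via im2col, which can still be faster than a sliding-window approach. Once the convolution is written that way, it is differentiated like any other matrix product.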
Have a look at Pete Warden’s blog post on this topic.
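To make the idea concrete, here is a minimal NumPy sketch of my own (not taken from the blog post). It assumes a single-channel 2-D input, one filter, stride 1, no padding, and the cross-correlation convention (no kernel flip) that deep learning frameworks typically use. The function names `im2col`, `col2im`, `conv_forward` and `conv_backward` are just illustrative choices. The forward pass becomes one matrix product, so the backward pass is the usual matrix-product gradient, followed by a scatter-add that sends the patch gradients back to the pixels they came from.

```python
import numpy as np

def im2col(x, kh, kw):
    """Unroll every kh x kw patch of the 2-D array x into one column."""
    H, W = x.shape
    out_h, out_w = H - kh + 1, W - kw + 1
    cols = np.empty((kh * kw, out_h * out_w), dtype=x.dtype)
    for i in range(out_h):
        for j in range(out_w):
            cols[:, i * out_w + j] = x[i:i + kh, j:j + kw].ravel()
    return cols

def col2im(dcols, x_shape, kh, kw):
    """Inverse of im2col for gradients: overlapping patches are summed."""
    H, W = x_shape
    out_h, out_w = H - kh + 1, W - kw + 1
    dx = np.zeros(x_shape, dtype=dcols.dtype)
    for i in range(out_h):
        for j in range(out_w):
            dx[i:i + kh, j:j + kw] += dcols[:, i * out_w + j].reshape(kh, kw)
    return dx

def conv_forward(x, w):
    """Convolution (cross-correlation) as a single matrix product."""
    kh, kw = w.shape
    out_h, out_w = x.shape[0] - kh + 1, x.shape[1] - kw + 1
    cols = im2col(x, kh, kw)
    y = w.ravel() @ cols              # one row of weights times the patch matrix
    return y.reshape(out_h, out_w), cols

def conv_backward(dy, cols, w, x_shape):
    """Gradients of the loss w.r.t. input and filter, given dL/dy."""
    kh, kw = w.shape
    dy_row = dy.ravel()
    dw = (cols @ dy_row).reshape(kh, kw)   # dL/dw = cols . dy
    dcols = np.outer(w.ravel(), dy_row)    # dL/dcols = w^T . dy
    dx = col2im(dcols, x_shape, kh, kw)    # sum contributions per pixel
    return dx, dw
```

A pixel that appears in several patches receives the sum of the gradients from all of them, which is why `col2im` accumulates with `+=` rather than overwriting. Real libraries vectorize the loops and handle channels, batches, stride and padding, but the differentiation rule is the same.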
Thank you very much!