Hi @liangstein!
I hope this image helps. Its from my lecture notes and shows how a feature map is produced from an image I with a kernel K. With each application of the kernel a dot product is calculated, which effectively is the sum of element-wise multiplications between I and K in an K-sized area within I.
You could say that kernel looks for diagonal features. It then searches the image and finds a perfect matching feature in the lower left corner. Otherwise the kernel is able to identify only parts the feature its looking for. This why the product is called a feature map, as it tells how well a kernel was able to identify a feature in any location of the image it was applied to.