Hi community
I am using 3D conv layers in a network where the inputs are stacked images of a subject at 3 points in time. Across the network I keep the depth at 3, although the spatial dimensions of the images are reduced by pooling layers. When visualizing the feature maps for each activation, I expected to see patterns relating to each of the 3 slices individually, but it looks like all feature maps are superimpositions of the 3 slices.
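For reference, here is a minimal sketch of the kind of setup I mean (layer sizes are just illustrative, and I'm using PyTorch's `Conv3d`/`MaxPool3d` as an example): the depth kernel spans all 3 slices, and pooling only acts spatially so the depth stays at 3.

```python
import torch
import torch.nn as nn

# Illustrative setup: 3 stacked images (depth=3), one input channel.
x = torch.randn(1, 1, 3, 64, 64)  # (batch, channels, depth, H, W)

# kernel_size=3 with padding=1 keeps depth at 3, but the depth
# dimension of the kernel covers all 3 slices at once.
conv = nn.Conv3d(in_channels=1, out_channels=8, kernel_size=3, padding=1)

# Pool only over the spatial dimensions, leaving depth untouched.
pool = nn.MaxPool3d(kernel_size=(1, 2, 2))

fmap = pool(conv(x))
print(fmap.shape)  # (1, 8, 3, 32, 32): depth preserved, spatial dims halved
```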
For instance, in the feature map at depth index 0 of the first convolution layer, I can see details that look like the third input slice. Is that normal behavior, or is my intuition about 3D conv layers completely wrong?
Thanks!