Right, so this seems pretty simple, but I might still be getting myself confused.
The paper says:
the deconvnet uses transposed versions of the same filters, but applied to the rectified maps, not the output of the layer beneath. In practice this means flipping each filter vertically and horizontally.
I take “the same filters” to refer to the filters learnt during training.
The paper refers to these vertically and horizontally flipped filters as the transposed ... filters.
Is it correct that the term “transposed filters” has no semantic relation to the term “transposed convolution”?
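To pin the terms down for myself, I tried a small sketch (my own code, not from the paper; the shapes and random weights are made up). At stride 1, `conv_transpose2d` seems to give exactly the same result as a normal `conv2d` whose filter has been flipped in both spatial dimensions and had its in/out channel axes swapped, so the two terms do look at least mechanically related:

```python
import torch
import torch.nn.functional as F

x = torch.randn(1, 3, 8, 8)                   # (N, C_in, H, W), made-up shapes
w = torch.randn(3, 5, 4, 4)                   # conv_transpose2d weight: (C_in, C_out, kH, kW)

# Transposed convolution, stride 1, no padding: output is (H + kH - 1, W + kW - 1).
a = F.conv_transpose2d(x, w)

# Same result from a *normal* convolution, once the filter is flipped in both
# spatial dimensions, the channel axes are swapped, and "full" padding is used.
w_flipped = w.flip([-1, -2]).transpose(0, 1)  # now (C_out, C_in, kH, kW)
b = F.conv2d(x, w_flipped, padding=3)         # padding = kH - 1

print(torch.allclose(a, b, atol=1e-4))        # True
```

As far as I can tell, it's only the stride/padding bookkeeping that separates the two in the general case.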
I appreciate this isn't PyTorch related, so I completely understand if it's too far off topic for this forum, but any easy-to-share pointers would be fab!
The paper says:
In the deconvnet, the unpooling operation uses these switches to place the reconstructions from the layer above into appropriate locations, …
So if the “layer above” had an output of
[
[1,2],
[3,4]
]
And if, on the forward pass, the layer below had a shape of 4x4 before the pooling operation.
Then, instead of using “stride” to introduce spaces between the activations, would the process in the paper just place the activations at the original locations recorded on the forward pass? So, for example, we could end up with something like:
[
[1,0,0,0],
[0,0,2,0],
[0,3,0,4],
[0,0,0,0]
]
And then the convolution operation would actually be a normal convolution, since the resolution of the features has already been increased, and would convolve the transposed (flipped) filters across the above feature map?
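In PyTorch terms, here's how I picture that whole step. The shapes, the random pre-pool map, and the 3x3 stand-in filter are all made up by me, and I'm assuming the paper's switches correspond to what `max_pool2d(..., return_indices=True)` records and `max_unpool2d` consumes:

```python
import torch
import torch.nn.functional as F

# Forward pass: 2x2 max pool over the 4x4 layer below, recording the
# "switches" (argmax locations) as indices.
pre_pool = torch.randn(1, 1, 4, 4)
_, switches = F.max_pool2d(pre_pool, kernel_size=2, return_indices=True)

# Pretend [[1,2],[3,4]] is the reconstruction coming down from the layer above.
recon = torch.tensor([[[[1., 2.],
                        [3., 4.]]]])

# Unpooling: each value lands back at its recorded location, everything else
# stays 0, giving a sparse 4x4 map like the one written out above.
unpooled = F.max_unpool2d(recon, switches, kernel_size=2)
print(unpooled.squeeze())

# Then an ordinary stride-1 convolution with the flipped ("transposed") filter
# runs over the already-upsampled map; no fractional stride needed.
filt = torch.randn(1, 1, 3, 3)                 # stand-in for a learnt filter
flipped = filt.flip([-1, -2]).transpose(0, 1)  # flip both spatial dims, swap channel axes
out = F.conv2d(unpooled, flipped, padding=1)
print(out.shape)                               # torch.Size([1, 1, 4, 4])
```

That is, the unpooling does the upsampling via the recorded switch locations, and the subsequent convolution is a plain stride-1 conv with the flipped filters, with no fractional striding anywhere. Is that reading right?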