Methods to extract features from a CNN

gktejus · June 2, 2020, 8:30am

I’m dealing with an encoder-decoder network where the decoder is followed by a fully-connected layer and a linear layer (output). Before flattening the features from the decoder, I’ve seen people do one of two things: either use a single nn.AdaptiveAvgPool2d() with an output size of, say, (7,7), flatten the result, and pass it to the fully-connected layer; or use nn.AdaptiveAvgPool2d() and nn.AdaptiveMaxPool2d(), each with an output size of (1,1), concatenate their outputs, flatten them, and pass the result to the fully-connected layers. I’m not sure what the benefits and disadvantages of each method are, or why one would be chosen over the other.
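To make the two options concrete, here is a minimal sketch of both. The decoder output shape (batch of 4, 256 channels, 14x14) and the fully-connected width of 128 are made-up values for illustration, not taken from any particular model.

```python
import torch
import torch.nn as nn

# Hypothetical decoder output: batch of 4, 256 channels, 14x14 feature map.
features = torch.randn(4, 256, 14, 14)

# Option 1: a single adaptive average pool down to a 7x7 grid, then flatten.
pool_grid = nn.AdaptiveAvgPool2d((7, 7))
flat_grid = torch.flatten(pool_grid(features), start_dim=1)

# Option 2: global average pool and global max pool, each to 1x1,
# concatenated along the channel dimension, then flattened.
avg_pool = nn.AdaptiveAvgPool2d((1, 1))
max_pool = nn.AdaptiveMaxPool2d((1, 1))
pooled = torch.cat([avg_pool(features), max_pool(features)], dim=1)
flat_concat = torch.flatten(pooled, start_dim=1)

# The fully-connected layer's input size follows from the flattened width.
# The output width of 128 is an arbitrary choice for this sketch.
fc_grid = nn.Linear(flat_grid.shape[1], 128)
fc_concat = nn.Linear(flat_concat.shape[1], 128)

print(flat_grid.shape)    # torch.Size([4, 12544])  -> 256 * 7 * 7
print(flat_concat.shape)  # torch.Size([4, 512])    -> 256 + 256
```

With these shapes, the first option keeps a coarse 7x7 spatial layout but feeds 12544 values per sample into the fully-connected layer, while the second discards the spatial layout entirely and feeds only 512.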