According to Google’s PyTorch implementation of Big Transfer (BiT), there is a subtle difference between the following two approaches. Could anyone explain the difference? Is it a different strategy for handling boundary pixels?
What’s the purpose of splitting the padding parameter out of nn.MaxPool2d and making it a separate padding layer (such as nn.ConstantPad2d) before the pooling?
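
To make the comparison concrete, here is a minimal sketch of the two approaches as I understand them. The kernel size, stride, pad width, and the use of nn.ConstantPad2d with a fill value of 0 are my assumptions for illustration, not something confirmed by the implementation. As far as I can tell from the docs, nn.MaxPool2d pads implicitly with negative infinity, so the two variants should only disagree where a pooling window touches the border and all real values in it are negative.

```python
import torch
import torch.nn as nn

# Approach A: padding handled inside the pooling layer.
# nn.MaxPool2d pads implicitly with negative infinity, so a padded
# position can never be selected as the maximum.
pool_builtin = nn.MaxPool2d(kernel_size=3, stride=2, padding=1)

# Approach B: an explicit zero-padding layer, then pooling without padding.
# The fill value 0.0 is an assumption made for this sketch.
pool_explicit = nn.Sequential(
    nn.ConstantPad2d(1, 0.0),
    nn.MaxPool2d(kernel_size=3, stride=2, padding=0),
)

# An all-negative input makes the difference visible at the border.
x = -torch.ones(1, 1, 4, 4)

out_builtin = pool_builtin(x)    # every window sees only -1 (and -inf padding)
out_explicit = pool_explicit(x)  # border windows also see the padded zeros

print(out_builtin)
print(out_explicit)
```

With this input, the built-in padding gives -1 everywhere, while the explicit zero padding gives 0 in every window that overlaps the border and -1 only in the interior window. For inputs that are non-negative (for example, directly after a ReLU) the two variants should produce the same result.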