Global Average Pooling Implementation

Is there any difference between nn.AvgPool2d(kernel_size=8) and nn.AvgPool2d(kernel_size=8, stride=1)?

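The default stride is equal to kernel_size, so the first layer uses a stride of 8 (non-overlapping windows) while the second uses a stride of 1 (overlapping windows). So, in general, yes, there is a difference.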
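To make the effect of the stride concrete, here is a rough sketch of the output-size arithmetic along one spatial dimension. The helper name pooled_size and the 32-pixel input are assumptions for illustration only; the formula assumes ceil_mode=False and mirrors the one given in the PyTorch docs for nn.AvgPool2d.

```python
def pooled_size(size, kernel_size, stride=None, padding=0):
    """Output length along one spatial dimension (assumes ceil_mode=False)."""
    if stride is None:
        stride = kernel_size  # same default that nn.AvgPool2d uses
    return (size + 2 * padding - kernel_size) // stride + 1


# Hypothetical 32x32 feature map: the two settings give different output sizes.
print(pooled_size(32, 8))            # 4  (non-overlapping windows)
print(pooled_size(32, 8, stride=1))  # 25 (overlapping windows)
```

So on an input larger than the kernel, the two layers produce differently shaped outputs.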

However, for global average pooling it doesn't make a difference, as long as the kernel covers the entire input feature map (8x8 here). There is only one window position, so the stride is never used, and each channel is averaged down to a single value.
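A quick way to check this is to run both layers on the same tensor. This is only a sketch: the batch size, channel count, and the 16x16 comparison input are made-up values, and it assumes the final feature map really is 8x8.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

pool_default = nn.AvgPool2d(kernel_size=8)            # stride defaults to 8
pool_stride1 = nn.AvgPool2d(kernel_size=8, stride=1)

# Assumed final feature map: batch of 2, 16 channels, 8x8 spatial size.
x = torch.randn(2, 16, 8, 8)
out_default = pool_default(x)
out_stride1 = pool_stride1(x)

print(out_default.shape)  # torch.Size([2, 16, 1, 1])
same = torch.allclose(out_default, out_stride1, atol=1e-6)
is_mean = torch.allclose(out_default, x.mean(dim=(2, 3), keepdim=True), atol=1e-6)
print(same, is_mean)

# On a larger (hypothetical) 16x16 map the kernel no longer covers everything,
# so the stride matters again and the output shapes diverge.
y = torch.randn(2, 16, 16, 16)
shape_default = pool_default(y).shape  # torch.Size([2, 16, 2, 2])
shape_stride1 = pool_stride1(y).shape  # torch.Size([2, 16, 9, 9])
print(shape_default, shape_stride1)
```

If the input size isn't fixed, nn.AdaptiveAvgPool2d(1) may be a safer choice, since it averages over the whole map regardless of its spatial size.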

