The network input is 3x112x112 (RGB, x, y), with zero padding (3, 3), a 7x7 filter, and stride 2.
So the network actually applies the 7x7, stride-2 filter to a padded 3x118x118 input. But it looks like the last row/column of padding would be left over.
What happens in that case?
Is the output 56x56 or 57x57?
Output will be 56x56.
import torch
m = torch.nn.Conv2d(in_channels=3, out_channels=3, kernel_size=7, stride=2, padding=3)
input = torch.randn(1,3,112,112)
output = m(input)
output.shape
# torch.Size([1, 3, 56, 56])
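The same number falls out of the usual output-size arithmetic, floor((n + 2p - k) / s) + 1, for a convolution without dilation. A minimal sketch in plain Python; `conv_out_size` is a name made up here for illustration, not a PyTorch function:

```python
def conv_out_size(n, k, s, p):
    """Output length along one dimension (no dilation assumed)."""
    # Padded length is n + 2p; the kernel fits floor((n + 2p - k) / s) + 1 times.
    return (n + 2 * p - k) // s + 1

print(conv_out_size(112, k=7, s=2, p=3))  # 56
```

The floor in the division is what discards the leftover row/column.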
Yes, I get the same result.
But where does the remaining 1 row/column of zero padding go?
Is it not used?
Have a look at the last animation here.
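The padding on one side (top/left) is fully used, while the last row and column of padding on the other side (bottom/right) is never covered by the kernel and is effectively dropped.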
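To see which padded index is left out, here is a rough sketch that only does the window arithmetic along one dimension for the sizes in this thread. It does not call PyTorch; the variable names are illustrative.

```python
n, k, s, p = 112, 7, 2, 3

padded = n + 2 * p                 # 118 positions, indices 0..117
n_out = (padded - k) // s + 1      # number of window positions

# Start index of every kernel window along this dimension
starts = [i * s for i in range(n_out)]
last_covered = starts[-1] + k - 1  # last index touched by any window

# Indices of the padded input that no window ever reaches
unused = list(range(last_covered + 1, padded))

print(n_out, starts[0], starts[-1], last_covered)  # 56 0 110 116
print(unused)                                      # [117]
```

Index 117 is the outermost padding position on the far side, so the first window still sees all three leading zeros while the last window sees only two of the three trailing ones.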