Hello! The following is the architecture for “MobileNet”:
The input size does get halved when going from the row 4 conv_dw / s2 layer (112 x 112 x 64) to the row 5 conv / s1 layer (56 x 56 x 64).
I don’t understand why the input size for the last conv / s1 layer is 7 x 7 x 1024 even though the layer before it (the conv_dw / s2 layer) has a stride of 2. Shouldn’t the input size for the last conv / s1 layer be 3 x 3 x 1024 instead?
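
As a sanity check on the arithmetic, here is a minimal Python sketch of the usual convolution output-size formula, floor((n + 2p - k) / s) + 1. The 3 x 3 kernel size and the padding values are assumptions on my part (they are not stated above), and `conv_out_size` is just a hypothetical helper name, not something from a library.

```python
def conv_out_size(n: int, k: int, s: int, p: int) -> int:
    """Spatial output size of a convolution: floor((n + 2p - k) / s) + 1.

    n: input height/width, k: kernel size, s: stride, p: padding per side.
    """
    return (n + 2 * p - k) // s + 1


# Assumed: 3 x 3 depthwise kernels; padding of 0 or 1 (not confirmed above).
print(conv_out_size(112, k=3, s=2, p=1))  # row 4 -> row 5: 56
print(conv_out_size(7, k=3, s=2, p=0))    # stride 2, no padding: 3
print(conv_out_size(7, k=3, s=2, p=1))    # stride 2, padding 1: 4
print(conv_out_size(7, k=3, s=1, p=1))    # stride 1, padding 1: 7
```

Under these assumptions, a stride of 2 on a 7 x 7 input gives 3 x 3 or 4 x 4 depending on the padding, and only a stride of 1 keeps it at 7 x 7, which is why the 7 x 7 x 1024 entry looks inconsistent to me.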