Hello! The following is the architecture for “MobileNet”:

The input size does get halved between the row 4 conv_dw / s2 layer and the row 5 conv / s1 layer: it goes from 112 x 112 x 64 to 56 x 56 x 64.
I don’t understand why the input size for the last conv / s1 layer is 7 x 7 x 1024 even though the layer before it (a conv_dw / s2 layer) has a stride of 2. Shouldn’t the input size for the last conv / s1 layer be 3 x 3 x 1024 instead?
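
To make the arithmetic behind my question concrete, here is a minimal sketch of the standard convolution output-size formula, floor((n + 2p - k) / s) + 1. The 3 x 3 kernel and the padding values are my assumptions (the table only lists filter shape and stride), and `conv_out_size` is just a helper name I made up for illustration:

```python
def conv_out_size(n, kernel=3, stride=1, padding=1):
    """Spatial output size of a convolution along one dimension.

    Uses the standard formula floor((n + 2*padding - kernel) / stride) + 1.
    kernel=3 and padding=1 are assumptions, not values stated in the table.
    """
    return (n + 2 * padding - kernel) // stride + 1


# Row 4 -> row 5: a stride-2 layer on a 112 x 112 input.
print(conv_out_size(112, stride=2))            # 56

# A stride-2 layer on a 7 x 7 input, with and without padding.
print(conv_out_size(7, stride=2, padding=1))   # 4
print(conv_out_size(7, stride=2, padding=0))   # 3

# A stride-1 layer with padding 1 leaves the size unchanged.
print(conv_out_size(7, stride=1, padding=1))   # 7
```

Under these assumptions, a stride of 2 applied to a 7 x 7 input would give 4 x 4 (or 3 x 3 with no padding), and only a stride of 1 would keep it at 7 x 7, which is why the 7 x 7 x 1024 entry looks inconsistent to me.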