Hello! The following is the architecture for “MobileNet”:
The input size does get halved between the row 4 conv_dw / s2 layer and the row 5 conv / s1 layer: it goes from 112 x 112 x 64 to 56 x 56 x 64.
I don’t understand why the input size for the last conv / s1 layer is 7 x 7 x 1024 even though the layer before it (the conv_dw / s2 layer) has a stride of 2. Shouldn’t the input size for the last conv / s1 layer be 3 x 3 x 1024 instead?
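
For reference, here is the arithmetic I am relying on, as a minimal sketch. It uses the standard convolution output-size formula with a 3 x 3 kernel. The function name conv_output_size and the padding values are my own assumptions for illustration; they are not taken from the table.

```python
def conv_output_size(size: int, kernel: int = 3, stride: int = 1, padding: int = 1) -> int:
    """Spatial output size of a convolution:
    floor((size + 2 * padding - kernel) / stride) + 1
    """
    return (size + 2 * padding - kernel) // stride + 1


# Row 4 -> row 5: stride-2 depthwise conv on a 112 x 112 input (assumed padding 1).
print(conv_output_size(112, stride=2))           # 56

# Last conv_dw / s2 layer on a 7 x 7 input.
print(conv_output_size(7, stride=2))             # 4 (assumed padding 1)
print(conv_output_size(7, stride=2, padding=0))  # 3 (assumed no padding)

# Only a stride of 1 (with padding 1) keeps the 7 x 7 size.
print(conv_output_size(7, stride=1))             # 7
```

So with a stride of 2 I would expect 3 x 3 or 4 x 4 depending on the padding, but not 7 x 7. Under these assumptions, 7 x 7 only comes out if the stride is 1.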