Question about EfficientNet layer input/output size mismatch

I have a question about the EfficientNet model here. Why does the AdaptiveAvgPool2d layer have an output size of (32, 1280, 1, 1), yet the input size of the succeeding classifier layer is (32, 1280)? I thought the two had to match, or is some kind of flattening done internally?

The code that generated the figure is below:

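import torch
import torchvision
from torchinfo import summary  # assumes summary comes from the torchinfo package

device = "cuda" if torch.cuda.is_available() else "cpu"

weights = torchvision.models.EfficientNet_B0_Weights.DEFAULT
model = torchvision.models.efficientnet_b0(weights=weights).to(device)

# Print per-layer input/output shapes for a batch of 32 RGB 224x224 images
summary(model=model,
        input_size=(32, 3, 224, 224),
        col_names=["input_size", "output_size", "num_params", "trainable"],
        col_width=20,
        row_settings=["var_names"])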
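
To make the shape question concrete, here is a minimal sketch of the two layers in isolation. As far as I can tell, torchvision's EfficientNet forward pass calls torch.flatten(x, 1) between the pooling layer and the classifier. Because that is a plain function call rather than a registered module, it would not appear as a row in the summary. The names avgpool and classifier below are stand-ins, not the real model's layers, and the 7x7 feature map and 1000 output classes are assumptions for illustration.

```python
import torch
from torch import nn

# Stand-ins (assumed, not taken from the real model) for the last two stages
avgpool = nn.AdaptiveAvgPool2d(1)
classifier = nn.Linear(1280, 1000)

# Assumed feature map shape entering the pooling layer for a 224x224 input
features = torch.randn(32, 1280, 7, 7)

pooled = avgpool(features)        # shape (32, 1280, 1, 1)
flat = torch.flatten(pooled, 1)   # collapses dims 1..3 into one: (32, 1280)
logits = classifier(flat)         # shape (32, 1000)

print(pooled.shape, flat.shape, logits.shape)
```

If that reading is right, the two sizes in the summary do not need to match directly, since the flatten step reshapes the tensor in between.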