Incorrect torchvision implementation of ResNeXt

If I understand correctly, resnext50_32x4d means a ResNet-50 with 32 groups (cardinality), where each group has 4 output channels (bottleneck width). However, the torchvision implementation seems to have 4 groups with 32 output channels each.

Can someone help me confirm this?

That’s also my understanding: cardinality (‘C’) == groups in the bottleneck, as per the original Facebook Torch implementation. So, as implemented there, that would be a 4x32d ResNeXt.
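To make the two readings concrete, here is a minimal sketch of the arithmetic in plain Python. The helper name grouped_conv_shape is hypothetical, and the width formula is an assumption modeled on how a ResNeXt bottleneck usually derives the width of its 3x3 grouped convolution; check it against the source of the version you have installed before relying on it.

```python
def grouped_conv_shape(planes, groups, width_per_group, base=64):
    """Return (total_channels, groups, channels_per_group) for the 3x3 conv.

    Assumed formula: the bottleneck width scales with planes and
    width_per_group, then is multiplied by the number of groups.
    """
    width = int(planes * (width_per_group / base)) * groups
    return width, groups, width // groups


# First stage of a ResNeXt-50, where planes = 64.
print(grouped_conv_shape(64, groups=32, width_per_group=4))  # "32x4d" reading
print(grouped_conv_shape(64, groups=4, width_per_group=32))  # "4x32d" reading
```

Both readings give the same total number of channels in the first stage and differ only in how those channels are split into groups, so the total width alone cannot tell them apart. If torchvision is installed, inspecting the groups attribute of the 3x3 convolution in the first bottleneck (for example model.layer1[0].conv2.groups on a model built with resnext50_32x4d) should show which split the library actually uses.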