I am training a DeepLabV3 model with a MobileNetV3-Large backbone (taken directly from PyTorch):
import torchvision

opt_model = torchvision.models.segmentation.deeplabv3_mobilenet_v3_large(
    progress=True,
    num_classes=1,
).cuda()
I have a custom dataset I am using for body segmentation. When I increase the batch size from 32 images to 256, the time it takes to go through one epoch increases. I timed the iterations within an epoch to see which parts took the most time, and it was these two lines:
scaler.scale(loss).backward() # type: ignore
scaler.step(optimizer)
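For context, these two calls normally sit inside a standard mixed-precision training step. Below is a minimal sketch, assuming `model`, `optimizer`, and `criterion` are defined elsewhere (and noting the `scaler.update()` call that should follow `scaler.step()`):

```python
import torch

scaler = torch.cuda.amp.GradScaler(enabled=torch.cuda.is_available())

def train_step(model, optimizer, criterion, images, targets):
    device_type = "cuda" if torch.cuda.is_available() else "cpu"
    optimizer.zero_grad(set_to_none=True)
    with torch.autocast(device_type=device_type):
        # torchvision segmentation models return a dict with an "out" key
        logits = model(images)["out"]
        loss = criterion(logits, targets)
    scaler.scale(loss).backward()   # backward pass on the scaled loss
    scaler.step(optimizer)          # unscales grads, then optimizer.step()
    scaler.update()                 # adjust the loss scale for the next step
    return loss.item()
```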
I am not sure what's going on. My assumption was that with a bigger batch size, it would be faster to go through one epoch.
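One thing worth knowing when timing code like this: CUDA kernels are launched asynchronously, so a plain Python timer tends to attribute almost all GPU work to whichever call happens to synchronize (often `backward()` or `scaler.step()`), not to the call that actually did the work. A sketch of a timing helper that synchronizes explicitly so the measurements are meaningful:

```python
import time
import torch

def timed(fn):
    """Run fn() and return (result, elapsed_seconds).

    torch.cuda.synchronize() is called before and after so that any
    queued asynchronous CUDA kernels are counted in the right place.
    """
    if torch.cuda.is_available():
        torch.cuda.synchronize()
    start = time.perf_counter()
    result = fn()
    if torch.cuda.is_available():
        torch.cuda.synchronize()
    return result, time.perf_counter() - start

# Hypothetical usage inside the training loop:
# _, t_backward = timed(lambda: scaler.scale(loss).backward())
# _, t_step = timed(lambda: scaler.step(optimizer))
```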
Solution was found
The default PyTorch model I was using has a bottleneck.
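For anyone hitting the same issue: one way to confirm where a model actually spends its time is `torch.profiler`. A minimal sketch with a tiny stand-in network (in practice you would pass your `deeplabv3_mobilenet_v3_large` and a full-sized batch instead):

```python
import torch
import torch.nn as nn
from torch.profiler import profile, ProfilerActivity

# Tiny stand-in model; substitute the real segmentation model here.
model = nn.Sequential(
    nn.Conv2d(3, 16, 3, padding=1),
    nn.ReLU(),
    nn.Conv2d(16, 1, 1),
)
x = torch.randn(4, 3, 64, 64)

activities = [ProfilerActivity.CPU]
if torch.cuda.is_available():
    activities.append(ProfilerActivity.CUDA)

with profile(activities=activities) as prof:
    model(x)

# Sort by total time to see which ops dominate the forward pass.
print(prof.key_averages().table(sort_by="cpu_time_total", row_limit=10))
```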