Is it better to set batch size as an integer power of 2 for torch.utils.data.DataLoader?

Hi, guys,
I have heard that it is better to set the batch size to an integer power of 2 for torch.utils.data.DataLoader, and I want to confirm whether that is true.

Any answer or idea will be appreciated!

Powers of two might be more "friendly" regarding the input shapes passed to specific kernels and could perform better than other shapes (internally, padding could be used if it yields an overall speedup).
However, it depends on your actual model, input shapes, etc., so you should profile different shapes and check for performance cliffs.
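As a starting point, here is a minimal profiling sketch using torch.utils.benchmark. It assumes a CUDA GPU and uses torchvision's resnet18 with 224x224 inputs purely as an example; swap in your own model, input shape, and batch sizes. Normalizing by batch size makes the throughput comparable across shapes:

```python
import torch
import torch.utils.benchmark as benchmark
import torchvision.models as models

device = "cuda" if torch.cuda.is_available() else "cpu"
model = models.resnet18().to(device).eval()

# Sweep a mix of power-of-two and non-power-of-two batch sizes.
for batch_size in [16, 24, 32, 48, 64, 96, 128]:
    x = torch.randn(batch_size, 3, 224, 224, device=device)
    timer = benchmark.Timer(
        stmt="with torch.no_grad(): model(x)",
        globals={"torch": torch, "model": model, "x": x},
    )
    measurement = timer.timeit(10)
    # Report per-sample time so different batch sizes are comparable.
    print(f"batch_size={batch_size:4d}: "
          f"{measurement.mean * 1e3 / batch_size:.4f} ms/sample")
```

torch.utils.benchmark.Timer handles CUDA synchronization for you, which makes it less error-prone than hand-rolled time.time() loops for GPU timing.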

Hi,
specifically, for ResNet, does the "kernel" you mentioned refer to the kernel_size in torch.nn.Conv2d?

No, by "kernel" I meant the compute kernels launched on the device (e.g., cuDNN kernels), not the kernel_size argument of torch.nn.Conv2d. Powers of two could be preferred in all dimensions, i.e., the number of channels, the spatial size, etc.
However, as described before, padding could be used internally so that you wouldn't hit a performance cliff, and you should thus profile your workloads.
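In the same spirit as the earlier sketch, you could also sweep a non-batch dimension such as the channel count. This is only an illustrative sketch; the layer shape and channel values are assumptions, not recommendations:

```python
import torch
import torch.nn as nn
import torch.utils.benchmark as benchmark

device = "cuda" if torch.cuda.is_available() else "cpu"

# Time a single 3x3 convolution while varying the channel dimension.
for channels in [48, 64, 96, 128, 192, 256]:
    conv = nn.Conv2d(channels, channels, kernel_size=3, padding=1).to(device)
    x = torch.randn(32, channels, 56, 56, device=device)
    timer = benchmark.Timer(
        stmt="with torch.no_grad(): conv(x)",
        globals={"torch": torch, "conv": conv, "x": x},
    )
    print(f"channels={channels:4d}: {timer.timeit(50).mean * 1e3:.3f} ms")
```

If a particular channel count is disproportionately slow relative to its neighbors, you have found one of the performance cliffs mentioned above.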


Got it.
So internal padding is a built-in mechanism, which means it is usually not necessary for us to set the batch size to a power of 2, am I right?

It can be used by cuDNN, and I don't think that native PyTorch kernels use it (at least I haven't seen it, but I might be wrong).

OK, thank you sincerely for your answer!