Is it better to set batch size as an integer power of 2 for torch.utils.data.DataLoader?

Hi, guys,
I have heard that it is better to set the batch size to an integer power of 2 for torch.utils.data.DataLoader, and I want to confirm whether that is true.

Any answer or idea will be appreciated!

Powers of two might be more "friendly" regarding the input shapes passed to specific kernels and could perform better than other shapes (internally, padding could be used if it yields an overall speedup).
However, it depends on your actual model, input shapes, etc., so you should profile different shapes and check for performance cliffs.
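As a starting point, here is a minimal profiling sketch using torch.utils.benchmark. It assumes a CUDA GPU and uses torchvision's resnet18 with 224x224 inputs purely as an example; swap in your own model, input shape, and batch sizes. Normalizing by batch size makes the throughput comparable across shapes:

```python
import torch
import torch.utils.benchmark as benchmark
import torchvision.models as models

device = "cuda" if torch.cuda.is_available() else "cpu"
model = models.resnet18().to(device).eval()

# Sweep a mix of power-of-two and non-power-of-two batch sizes.
for batch_size in [16, 24, 32, 48, 64, 96, 128]:
    x = torch.randn(batch_size, 3, 224, 224, device=device)
    timer = benchmark.Timer(
        stmt="with torch.no_grad(): model(x)",
        globals={"torch": torch, "model": model, "x": x},
    )
    measurement = timer.timeit(10)
    # Report per-sample time so different batch sizes are comparable.
    print(f"batch_size={batch_size:4d}: "
          f"{measurement.mean * 1e3 / batch_size:.4f} ms/sample")
```

torch.utils.benchmark.Timer handles CUDA synchronization for you, which makes it less error-prone than hand-rolled time.time() loops for GPU timing.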

Hi,
specifically, for ResNet, does the "kernel" you mentioned refer to the kernel_size in torch.nn.Conv2d?

No, by "kernel" I meant the compute kernels launched on the device (e.g., cuDNN kernels), not the kernel_size argument of torch.nn.Conv2d. Powers of two could be preferred in all dimensions, i.e., the number of channels, the spatial size, etc.
However, as described before, padding could be used internally so that you wouldn't hit a performance cliff, and you should thus profile your workloads.
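In the same spirit as the earlier sketch, you could also sweep a non-batch dimension such as the channel count. This is only an illustrative sketch; the layer shape and channel values are assumptions, not recommendations:

```python
import torch
import torch.nn as nn
import torch.utils.benchmark as benchmark

device = "cuda" if torch.cuda.is_available() else "cpu"

# Time a single 3x3 convolution while varying the channel dimension.
for channels in [48, 64, 96, 128, 192, 256]:
    conv = nn.Conv2d(channels, channels, kernel_size=3, padding=1).to(device)
    x = torch.randn(32, channels, 56, 56, device=device)
    timer = benchmark.Timer(
        stmt="with torch.no_grad(): conv(x)",
        globals={"torch": torch, "conv": conv, "x": x},
    )
    print(f"channels={channels:4d}: {timer.timeit(50).mean * 1e3:.3f} ms")
```

If a particular channel count is disproportionately slow relative to its neighbors, you have found one of the performance cliffs mentioned above.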


Got it.
So internal padding is a built-in mechanism, which means it is usually not necessary for us to set the batch size to a power of 2, am I right?

It can be used by cuDNN, and I don't think that native PyTorch kernels use it (at least I haven't seen it, but I might be wrong).

OK, thank you sincerely for your answer!