How do you choose the right Batch Size?

Naive question.

How do you know that the batch size you have selected is the right one for your GPU while training ML models? I am training a model and I think the batch size is too small. If I increase it to almost double, the run goes out of memory. So I know the sweet spot is somewhere in between if I want to optimize for speed.

How much can I increase it to make better use of the memory? The volatile GPU util is 52% and memory usage is 3484MiB / 12212MiB.
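
For reference, this is roughly how one could measure the peak memory of a single training step (assuming PyTorch; `peak_memory_mib` is just an illustrative helper, not a library function):

```python
import torch

def peak_memory_mib(model, batch, device="cuda"):
    """Run one forward/backward pass and report peak GPU memory in MiB."""
    model = model.to(device)
    batch = batch.to(device)
    torch.cuda.reset_peak_memory_stats(device)
    model(batch).sum().backward()        # dummy loss so gradient memory is counted too
    torch.cuda.synchronize(device)
    return torch.cuda.max_memory_allocated(device) / 1024**2

# e.g. peak_memory_mib(my_model, torch.randn(64, 3, 224, 224))
```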

Thanks in advance.

I don’t think there’s a straightforward way to set the batch size. One obviously aims to maximize the batch size to improve raw processing throughput, so you increase it until you run out of memory and then dial it back a bit.
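
A minimal sketch of that search, assuming PyTorch and a hypothetical `run_step(batch_size)` callback that performs one forward/backward pass at the given size:

```python
import torch

def find_max_batch_size(run_step, start=2, limit=4096):
    """Double the batch size until CUDA runs out of memory,
    then return the last size that fit ("increase, then dial back")."""
    bs, last_ok = start, None
    while bs <= limit:
        try:
            run_step(bs)                 # one forward/backward at this size
            last_ok = bs
            bs *= 2                      # double until we hit the memory wall
        except RuntimeError as e:        # CUDA OOM surfaces as a RuntimeError
            if "out of memory" not in str(e):
                raise
            torch.cuda.empty_cache()     # release the failed allocation
            break
    return last_ok
```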

Note that when increasing (decreasing) the batch size, you might also want to increase (decrease) the learning rate. Since you average the gradients over the whole batch, larger batches tend to produce smaller, less noisy gradient updates.
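
As a rough sketch, the common linear scaling heuristic looks like this (the reference batch size of 256 and base learning rate of 0.1 are just illustrative values):

```python
def scaled_lr(batch_size, base_bs=256, base_lr=0.1):
    """Linear scaling: the learning rate grows/shrinks in proportion to the batch size."""
    return base_lr * batch_size / base_bs

# e.g. scaled_lr(512) -> 0.2, scaled_lr(128) -> 0.05
```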