Time and memory complexity

Suppose I have 9 images that have to be processed by a neural network NN. I have two choices.

  • Sequentially process the 9 images through NN one after another, with batch dimension 1.
  • Stack all 9 images along the batch dimension, i.e. the batch size becomes 9.

Certainly, the output for each image remains the same in both cases, but which is better in terms of execution time and memory consumption?

Does the answer vary w.r.t. execution on CPU vs. GPU?

Here I am only concerned with inference statistics, i.e. NN will run inside a torch.no_grad() wrapper.
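For concreteness, the two choices can be sketched as follows. The model here is a toy stand-in for NN (its layers and the 3×32×32 input shape are assumptions), but the two code paths are exactly the ones being compared:

```python
import torch
import torch.nn as nn

# Toy stand-in for NN; the real architecture and input shape are assumptions.
model = nn.Sequential(nn.Conv2d(3, 8, 3), nn.Flatten(), nn.LazyLinear(10))
model.eval()

images = torch.randn(9, 3, 32, 32)  # 9 images stacked along the batch dim

with torch.no_grad():
    # Choice 1: sequential, batch dim 1 for each forward pass
    outs_seq = torch.cat([model(img.unsqueeze(0)) for img in images])
    # Choice 2: a single forward pass with batch size 9
    outs_batch = model(images)

print(torch.allclose(outs_seq, outs_batch, atol=1e-5))
```

As stated in the question, both paths produce the same per-image outputs (up to floating-point tolerance); only time and memory differ.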

Usually we prefer a convenient batch size from 8 to 32, sometimes even more, but it can affect training.

On a GPU with a large batch size you can run into VRAM limitations.

This is because for each training sample we have to store some tensors that will be used during backpropagation, so memory consumption often depends linearly on the batch size (if you are working on Linux, e.g. Ubuntu, you can monitor VRAM consumption during training by running watch nvidia-smi in a terminal).
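As a programmatic alternative to watching nvidia-smi, PyTorch can report its own VRAM usage from inside the process. A minimal sketch (the helper name is mine; it simply returns None on CPU-only machines):

```python
import torch

def report_vram():
    # Query VRAM tracked by PyTorch's caching allocator (GPU only).
    # memory_allocated: bytes held by live tensors;
    # memory_reserved: bytes reserved from the driver, including cache.
    if not torch.cuda.is_available():
        return None  # nothing to report without a CUDA device
    return (torch.cuda.memory_allocated(), torch.cuda.memory_reserved())

stats = report_vram()
print(stats)
```

Note that nvidia-smi will typically show a higher number than memory_allocated, since it also counts the CUDA context and cached-but-unused blocks.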

To the question of time: you can test both performances using the simple %%timeit magic if you are working in a Jupyter Notebook (as I do); otherwise, just use Python's time module and log the results.
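Outside a notebook, the timing comparison can be sketched like this (the small CNN and input shape are placeholders for your NN; the bench helper is mine, using time.perf_counter with warm-up runs so one-time setup cost is not counted):

```python
import time
import torch
import torch.nn as nn

# Placeholder model; substitute your own NN and input shape.
model = nn.Sequential(nn.Conv2d(3, 16, 3), nn.ReLU(),
                      nn.Flatten(), nn.LazyLinear(10))
model.eval()
images = torch.randn(9, 3, 64, 64)

def bench(fn, warmup=2, reps=10):
    # Average wall-clock time per call, after a few warm-up runs.
    with torch.no_grad():
        for _ in range(warmup):
            fn()
        t0 = time.perf_counter()
        for _ in range(reps):
            fn()
    return (time.perf_counter() - t0) / reps

t_seq = bench(lambda: [model(img.unsqueeze(0)) for img in images])
t_batch = bench(lambda: model(images))
print(f"sequential: {t_seq * 1e3:.2f} ms, batched: {t_batch * 1e3:.2f} ms")
```

On a GPU you would additionally call torch.cuda.synchronize() before reading the clock, since CUDA kernels launch asynchronously.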

Now you can try to research a little, and please report back when you’re done :slight_smile:

My prediction is that the case with batch size = 9 will be slightly faster than the other one.

Thank you! Indeed, batch = 9 will be more memory-heavy. And in the question I just meant the inference statistics; training can be ignored. I have edited the question.