When I am training a simple ResNet (convolutional forward and backward passes) on the CelebA dataset, I find that the time consumed by loss.backward() (MSE loss) increases linearly with batch_size. A simple profile is as follows:
Here is another backward profile. The batch_size is set to 512, and GPU RAM consumption is about 4.7 GB out of 12 GB.
I am sure the bottleneck is loss.backward(). However, neither the CPU nor the GPU is fully utilized, so it is strange that the backward time grows linearly; the parallel processing does not seem to be working here.
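In case it matters, here is roughly how the backward time can be measured in isolation (a minimal sketch, not my exact training script; the torchvision resnet18 stand-in, the single regression output, and the CelebA-sized inputs are assumptions). torch.cuda.synchronize() is included because CUDA kernels launch asynchronously, so timing backward() without it can be misleading:

```python
import time
import torch
import torch.nn as nn
import torchvision.models as models

# Sketch: time loss.backward() alone at several batch sizes.
device = torch.device("cuda")
model = models.resnet18(num_classes=1).to(device)  # stand-in for the actual resnet
criterion = nn.MSELoss()

for batch_size in (64, 128, 256, 512):
    x = torch.randn(batch_size, 3, 218, 178, device=device)   # CelebA-sized dummy input
    target = torch.randn(batch_size, 1, device=device)

    model.zero_grad(set_to_none=True)
    loss = criterion(model(x), target)

    torch.cuda.synchronize()          # make sure the forward pass has finished
    start = time.time()
    loss.backward()
    torch.cuda.synchronize()          # wait for all backward kernels to complete
    print(f"batch_size={batch_size}: backward took {time.time() - start:.4f}s")
```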