Optimizer.step() is very slow

I am training a Densely Connected U-Net model on 512x512 CT scan data for a segmentation task.
My network training was very slow, so I profiled the different steps in my code and found the optimizer.step() line to be the bottleneck. It is extremely slow, taking nearly 0.35 s every iteration. The time taken by the other steps is as follows:

My optimizer declaration is: `optimizer = optim.Adam(model.parameters(), lr=0.001)`
I cannot understand the reason for this. Can someone suggest some possible causes?

Thanks :slight_smile:


The relative timings of backward and optimizer.step() seem unrealistic. Are you running on CUDA? Did you call torch.cuda.synchronize() each time before starting or stopping the timer?
I have to ask because missing synchronization is a very common cause of invalid performance analyses: CUDA kernels launch asynchronously, so without a sync you measure only the launch, not the work.
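A minimal sketch of what correct timing looks like (the `timed` helper name is mine, not a PyTorch API; the pattern is standard):

```python
import time
import torch

def timed(fn, device):
    """Time fn(), synchronizing around it when running on CUDA.

    CUDA kernels are launched asynchronously, so without synchronize()
    the timer would only measure the kernel launch overhead.
    """
    if device.type == "cuda":
        torch.cuda.synchronize()
    t0 = time.perf_counter()
    out = fn()
    if device.type == "cuda":
        torch.cuda.synchronize()
    return out, time.perf_counter() - t0

# Example usage (works on CPU too; the syncs are just no-ops then):
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
a = torch.randn(256, 256, device=device)
out, dt = timed(lambda: a @ a, device)
print(f"matmul took {dt * 1e3:.3f} ms")
```

Timing each of forward, loss.backward(), and optimizer.step() this way should give you numbers that actually reflect where the GPU spends its time.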

Best regards



Thanks for your reply. Sorry I did not.

Here are the results of the run after adding torch.cuda.synchronize():

Now loss.backward() is the slowest step. I can understand the forward calculation and loss.backward() having similar time complexity, but why is the optimizer step ~30x faster? Are all the weight updates done simultaneously, while loss.backward() is sequential?

optimizer.step() is ~30x faster because it is a very cheap operation: its cost is linear in the number of parameters (a few elementwise ops per weight). The forward and backward passes, by contrast, run matmuls and convolutions, where each weight participates in many multiply-adds, so their cost grows much faster than the parameter count.
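A back-of-the-envelope comparison makes the gap concrete. The layer sizes below are hypothetical (not taken from the poster's model), and "6 elementwise ops per parameter" is a rough figure for Adam's update:

```python
# Hypothetical layer: 3x3 conv, 64 in/out channels, on a 512x512 feature map.
in_ch, out_ch, k, H, W = 64, 64, 3, 512, 512

params = out_ch * in_ch * k * k      # weight count: 36,864
fwd_macs = params * H * W            # each weight is reused at every spatial position

# Adam touches each parameter a small constant number of times
# (gradient, two moment updates, bias correction, weight update).
adam_ops_per_param = 6               # rough estimate
adam_ops = params * adam_ops_per_param

print(fwd_macs // adam_ops)          # → 43690: forward does ~4e4x more work
```

So even though both are parallelized on the GPU, the optimizer simply has far less arithmetic to do per iteration than the forward/backward passes.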