Optimizer.step() is very slow

I am training a Densely Connected U-Net model on 512x512 CT scan data for a segmentation task.
My training was very slow, so I profiled the different steps in my code and found the optimizer.step() line to be the bottleneck. It is extremely slow and takes nearly 0.35 seconds every iteration. The time taken by the other steps is as follows:

[screenshot of per-step timings]
My optimizer declaration is: optimizer = optim.Adam(model.parameters(), lr=0.001)
I cannot figure out the reason. Can someone suggest some possible causes?

Thanks 🙂


The ratio between the backward and optimizer timings seems unrealistic: are you on CUDA? Did you call torch.cuda.synchronize() each time you started or stopped the timer?
I have to ask because it’s a very common cause of invalid performance analyses.
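For reference, a minimal sketch of what synchronized timing looks like; model, loader, criterion, and optimizer are placeholder names here, not taken from your code:

```python
import time
import torch

def timed(label, fn):
    torch.cuda.synchronize()      # finish all pending kernels before starting the clock
    start = time.perf_counter()
    out = fn()
    torch.cuda.synchronize()      # wait for the work launched by fn() to complete
    print(f"{label}: {time.perf_counter() - start:.4f} s")
    return out

for images, targets in loader:    # placeholder DataLoader
    images, targets = images.cuda(), targets.cuda()
    optimizer.zero_grad()
    outputs = timed("forward", lambda: model(images))
    loss = criterion(outputs, targets)
    timed("backward", lambda: loss.backward())
    timed("optimizer.step", lambda: optimizer.step())
```

Without the synchronize calls, CUDA kernels are launched asynchronously, so the cost of an earlier step (e.g. the backward pass) can show up in the timing of whatever line happens to block next (often optimizer.step()).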

Best regards

Thomas


Thanks for your reply. Sorry, I did not.

Here are the results of the run after adding torch.cuda.synchronize():

Now loss.backward() is the slowest step. I can understand the forward calculation and loss.backward() having similar time complexity, but what is the reason for the optimizer step being ~30x faster? Are all the weight updates done simultaneously, while loss.backward() is sequential?

optimizer.step() is ~30x faster because it is a very cheap, small operation (usually linear in the number of parameters), compared to the forward and backward passes through the network itself (which involve matmuls / convolutions whose cost is quadratic or more in the number of parameters).
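To make the cost difference concrete, here is a rough sketch of the per-tensor work an Adam-style step performs (assuming plain Adam without weight decay; the function and argument names are illustrative, not PyTorch internals). It is just a handful of elementwise operations over each weight tensor, i.e. one linear pass over the parameters:

```python
import torch

def adam_step(param, grad, exp_avg, exp_avg_sq, step,
              lr=1e-3, beta1=0.9, beta2=0.999, eps=1e-8):
    # Update biased first and second moment estimates (elementwise).
    exp_avg.mul_(beta1).add_(grad, alpha=1 - beta1)
    exp_avg_sq.mul_(beta2).addcmul_(grad, grad, value=1 - beta2)
    # Bias corrections for the running moments.
    bias_correction1 = 1 - beta1 ** step
    bias_correction2 = 1 - beta2 ** step
    denom = (exp_avg_sq / bias_correction2).sqrt().add_(eps)
    # Elementwise parameter update: O(number of elements) work.
    param.addcdiv_(exp_avg, denom, value=-lr / bias_correction1)
```

The forward and backward passes, by contrast, run every convolution in the network over 512x512 feature maps, which is vastly more arithmetic than one elementwise sweep over the weights.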