Autograd.grad extremely slow

Does anyone have an intuition for why calling backward() through gradients computed with autograd.grad would be several hundred times slower than the forward pass? This is with PyTorch 1.4 and CUDA toolkit 10.1.243.
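
For reference, here is a minimal sketch of the kind of double-backward pattern being described (the model and loss below are placeholders, not the actual code from the post):

```python
import torch

device = "cuda" if torch.cuda.is_available() else "cpu"

# Placeholder model and input, only to illustrate the pattern.
model = torch.nn.Linear(10, 1).to(device)
x = torch.randn(32, 10, device=device, requires_grad=True)

out = model(x).sum()

# First-order gradient, keeping the graph so we can differentiate through it.
(grad_x,) = torch.autograd.grad(out, x, create_graph=True)

# A loss that depends on the gradient itself; backward() here is the
# "backward applied to autograd.grad" step.
penalty = grad_x.pow(2).sum()
penalty.backward()
```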

Hi,

That would depend a lot on your forward pass. It should not do that :smiley:
Do you have a code sample that reproduces this? If possible on colab (https://colab.research.google.com/notebook#create=true) so that we can easily test it.

And here’s a graph of the full network: (graph image not included)

I tracked the problem down to cudnn.benchmark. Only the first iteration is very slow; once that passes, the subsequent iterations take the expected amount of time.

Ah, right. This is expected then. With cudnn.benchmark enabled, the first time a new input size is encountered takes a very long time, because cuDNN benchmarks several algorithms to pick the fastest one.
If your model has many convolutions with different input sizes, it will do this many times, which can indeed be quite slow.
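
A rough illustration of that warm-up cost (a stand-in conv model, not the original network; assumes a CUDA device is available):

```python
import time
import torch

# Autotune convolution algorithms per input size.
torch.backends.cudnn.benchmark = True

model = torch.nn.Conv2d(3, 16, kernel_size=3, padding=1).cuda()
x = torch.randn(8, 3, 224, 224, device="cuda")

for i in range(3):
    torch.cuda.synchronize()
    start = time.time()
    out = model(x).sum()
    out.backward()
    torch.cuda.synchronize()
    # Iteration 0 pays the cuDNN benchmarking cost; later ones do not.
    print(f"iteration {i}: {time.time() - start:.4f}s")
```

So when timing, either warm up with a few iterations first, or set `torch.backends.cudnn.benchmark = False` to avoid the one-time search (at the cost of possibly slower steady-state convolutions).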

Happy that you found the reason!