PyTorch JVP slow

pomonam · September 3, 2020, 2:07am

Using functional.jvp (https://pytorch.org/docs/stable/autograd.html) and taking the gradient with respect to it is 2 times slower in general, and for larger networks (ResNet-20) it is 4 times slower. Are there any known issues with using JVP? Are there any methods to make it more efficient (maybe using other packages)?
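
For reference, a minimal sketch of the pattern described above. The small linear model, the tensor shapes, and the squared-norm loss are illustrative assumptions, not the actual ResNet-20 setup; only `torch.autograd.functional.jvp` and its `create_graph` argument come from the PyTorch API.

```python
import torch
from torch.autograd.functional import jvp

torch.manual_seed(0)

# Hypothetical stand-in for the real network (assumption for illustration).
model = torch.nn.Linear(4, 3)
x = torch.randn(2, 4)   # input batch
v = torch.randn(2, 4)   # tangent direction, same shape as x


def f(inp):
    # Function whose Jacobian-vector product is taken with respect to `inp`.
    return torch.tanh(model(inp))


# create_graph=True keeps the JVP differentiable, so a loss built from it
# can be backpropagated into the model parameters.
out, jv = jvp(f, x, v, create_graph=True)

loss = jv.pow(2).sum()  # illustrative loss on the JVP
loss.backward()         # gradient of the loss w.r.t. model.weight and model.bias
```

Timing this against a plain forward and backward pass of the same model is one way to reproduce the slowdown being asked about.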

albanD · September 3, 2020, 2:46am

Hi,

As mentioned in the note on the function definition, the function uses the “double backward trick” to compute the jvp and is thus expected to be slower.
We are working on changing that for 1.7.
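
To illustrate why the trick costs more than a single backward pass, here is a rough sketch of the idea, not the library's actual implementation. The helper name `jvp_double_backward` and the toy function are made up for this example. The vector-Jacobian product `J^T u` is linear in the dummy cotangent `u`, so differentiating it with respect to `u` in direction `v` gives `J v`, at the price of two reverse-mode passes.

```python
import torch


def jvp_double_backward(f, x, v):
    """Compute J(x) @ v using two reverse-mode passes (hypothetical helper)."""
    x = x.detach().requires_grad_(True)
    y = f(x)
    # Dummy cotangent; its value is irrelevant because g is linear in u.
    u = torch.zeros_like(y, requires_grad=True)
    # First backward pass: g = J^T u, kept differentiable w.r.t. u.
    (g,) = torch.autograd.grad(y, x, grad_outputs=u, create_graph=True)
    # Second backward pass: differentiate g w.r.t. u, contracted with v -> J v.
    (jv,) = torch.autograd.grad(g, u, grad_outputs=v)
    return jv


torch.manual_seed(0)
W = torch.randn(3, 4)


def toy_fn(x):
    # Toy function with a known Jacobian: diag(1 - tanh^2(W x)) @ W.
    return torch.tanh(W @ x)


x0 = torch.randn(4)
v0 = torch.randn(4)
jv0 = jvp_double_backward(toy_fn, x0, v0)

# Closed-form JVP of the toy function, for comparison.
expected = (1 - torch.tanh(W @ x0) ** 2) * (W @ v0)
```

Each call builds a graph for the first backward pass and then runs a second backward pass through it, which is consistent with the slowdown reported above.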

pomonam · September 3, 2020, 3:06am

Thank you for the reply! That is great to hear. When do you expect 1.7 to be released?

albanD · September 3, 2020, 2:51pm

It should be early November IIRC.

InfoMax · October 23, 2020, 10:05am

Are these changes already available in pytorch-nightly?

albanD · October 23, 2020, 2:43pm

Hi,

I am afraid this got delayed and won’t be in 1.7.
But the changes should get into master soon after we are done working on the release (mid/end November).

InfoMax · December 2, 2020, 10:57am

Hi,

Thank you for your answer. Is there an issue I can follow to know when this is merged into master?

albanD · December 2, 2020, 2:21pm

Sure, you can follow https://github.com/pytorch/pytorch/issues/10223, which contains most of the discussions around this.