Profiling backward for inplace operations

We want to profile the backward time of in-place operations, e.g.:
a *= b
For this we need to detach a and set requires_grad to True.
However, a is then a leaf variable modified in-place, which autograd doesn't allow.

Is there any workaround?
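
For reference, a minimal sketch of the failure we hit (the exact error wording may differ between PyTorch versions):

import torch

a = torch.rand(10).detach().requires_grad_(True)  # detached, so `a` is a leaf that requires grad
b = torch.rand(10)

# Raises RuntimeError: a leaf Variable that requires grad is being used in an in-place operation
a *= b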

Hi,

I’m not sure I understand your goal. You want to benchmark the runtime of a single op?

If so, you can do:

import torch

base = torch.rand(10, requires_grad=True)
b = torch.rand(10)  # b as in your question; any tensor of matching shape

a = base + 1  # The backward of this is an identity, so as fast as it can get
a *= b

a.backward(a)

But note that you’re mostly going to measure the overhead of autograd here.
You can use the autograd profiler, though, to get a more precise idea without the autograd overhead.
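
For example, a minimal sketch with torch.autograd.profiler (tensor sizes are just placeholders, and op names shown in the table such as MulBackward0 are implementation details):

import torch

base = torch.rand(10, requires_grad=True)
b = torch.rand(10)

a = base + 1
a *= b

with torch.autograd.profiler.profile() as prof:
    a.backward(torch.ones_like(a))  # stand-in gradient of the right shape

# One row per op, including the backward of the in-place mul
print(prof.key_averages().table(sort_by="self_cpu_time_total"))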


Yes, we want to profile the runtime of single ops, when it makes sense
(and for everything in the computation graph, actually).

I assume the gradients for mul are already created in autograd, and it’s just a matter of whether they are accumulated (== a copy? or something similar) into a leaf or not.
Maybe for div() (or some other in-place ops) it’s more noticeable.
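
One way to check that assumption is to look at the graph autograd records for the in-place mul; a minimal sketch (node names like MulBackward0 and AccumulateGrad are internal and may change between versions):

import torch

base = torch.rand(10, requires_grad=True)
b = torch.rand(10)

a = base + 1
a *= b

# The in-place mul has its own backward node; walking next_functions
# eventually reaches the AccumulateGrad node that writes into base.grad.
print(a.grad_fn)                 # e.g. <MulBackward0 object at ...>
print(a.grad_fn.next_functions)  # chain leading to AccumulateGrad for `base`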