TL;DR, does computing `torch.var(x)` on a 1D tensor of n values have a quadratic runtime with autograd? Or does autograd it have a linear runtime in n?

Longer version:
If I have n values in a 1D tensor, and I compute var within a nn.Module, will autograd compute an n by n Jacobian in quadratic runtime?

For example, if we have values
`x = torch.tensor([1.,2.,4.,8.,16.,32.,64.], requires_grad=True)`
then `y = torch.var(x)` will set `grad_fn` in `y`.

The variance can be conceptualized as computing all pairwise distances (with a quadratic number of operations) and shifting and scaling:
`torch.sum( (x.view(-1,1) - x.view(1,-1))**2 / (2*len(x)*(len(x)-1)) )`

Or the varaince can be conceptualized as caching the mean (with a linear number of operations), computing all distances to that mean (also linear) and then scaling the result:
`torch.sum( (x - torch.mean(x))**2 / (len(x)-1) )`

In the first way, it’s clear that all pairs of values in `x` affect `torch.var`, and so an n by n matrix will be computed by autograd.

In the second (more standard way), the mean is cached, with autograd depicting a dependency of the mean on each value from `x`. Then, the mean collides with every value from `x` to compute the variance. So in this case, it appears the gradients will be quadratic as well. Is that the case?

Also, how is there a way that I could’ve delved into `y.grad_fn` to determine this for myself without asking and without empirically measuring whether the runtimes seem to grow quadratically? Thank you!

Just bumping this in case it got lost during summer holidays. Any thoughts are appreciated!