How does the torch.gradient function work?

I am reading the docs for the torch.gradient function, but I can't understand how the gradient is calculated. Specifically, for the first example, where the gradient of f(x) = x**2 is approximated, how can we compute the gradient without knowing the function values f(x + hr) and f(x - hl)? The docs say that the derivative is approximated with the following formula:

$$f'(x) \approx \frac{h_l^2 f(x + h_r) - h_r^2 f(x - h_l) + (h_r^2 - h_l^2) f(x)}{h_r h_l^2 + h_l h_r^2}$$

but we don't know f(x + hr) and f(x - hl).

Note that, according to the docs, you pass the values of the function as the input tensor, so f(x + hr) and f(x - hl) are simply neighboring entries of that tensor:

    input (Tensor) – the tensor that represents the values of the function
>>> import torch
>>> torch.gradient(torch.tensor([1, 2, 3, 4]))
(tensor([1., 1., 1., 1.]),)
>>> torch.gradient(torch.tensor([1, 2, 3, 4]), spacing=(torch.tensor([1, 2, 3, 4]),))
(tensor([1., 1., 1., 1.]),)
>>> torch.gradient(torch.tensor([1, 2, 3, 4]), spacing=(torch.tensor([2, 4, 6, 8]),))
(tensor([0.5000, 0.5000, 0.5000, 0.5000]),)

Still, I don't get how the derivative is calculated. In your examples we basically have the functions y = x and y = 0.5*x. Let's focus on the first example, where the first coordinate is 1 and the value is 1. I can't understand how, from just these values, torch.gradient is able to calculate a derivative equal to 1.

For x[i], hr is x[i+1] - x[i] and hl is x[i] - x[i-1].

Correspondingly, f(x[i] + hr) is f(x[i+1]) and f(x[i] - hl) is f(x[i - 1]).
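
In code, for the coordinate tensor used in the example below, those step sizes at the interior points can be computed in one go (a small sketch; the slicing is my own):

import torch

x = torch.tensor([-2., -1., 1., 4.])
hl = x[1:-1] - x[:-2]  # tensor([1., 2.]): left step at each interior point
hr = x[2:] - x[1:-1]   # tensor([2., 3.]): right step at each interior point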

In the example below, let's say the gradient has to be computed at index 1 (x = -1).

import torch

x = (torch.tensor([-2., -1., 1., 4.]),)  # coordinates of the samples
y = torch.tensor([4., 1., 1., 16.])      # f(x) = x**2 evaluated at those coordinates

torch.gradient(y, spacing=x)  # result: (tensor([-3., -2.,  2.,  5.]),)

# for index 1
x_i = -1.
f_x = 1.

hr = 2.  # x[2] - x[1] = 1 - (-1), so x_i + hr = 1
f_x_hr = 1.  # f(1)

hl = 1.  # x[1] - x[0] = -1 - (-2), so x_i - hl = -2
f_x_hl = 4.  # f(-2)

# by the formula in the docs
g_f_x = ((hl ** 2) * f_x_hr - (hr ** 2) * f_x_hl + (hr ** 2 - hl ** 2) * f_x) / (hr * (hl ** 2) + hl * (hr ** 2))  # gives -2.0, matching index 1 of the output above

For the edge cases, like index 0, the gradient seems to be computed as (y[1] - y[0]) / (x[1] - x[0]), i.e. the standard one-sided slope formula.
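
Putting both pieces together, here is a sketch of what torch.gradient appears to do for a 1-D tensor with explicit coordinates (the function name and structure are my own, not PyTorch's actual implementation): the second-order formula at the interior points and one-sided slopes at the edges.

import torch

def manual_gradient(y, x):
    g = torch.empty_like(y)
    # edges: one-sided slope (y2 - y1) / (x2 - x1)
    g[0] = (y[1] - y[0]) / (x[1] - x[0])
    g[-1] = (y[-1] - y[-2]) / (x[-1] - x[-2])
    # interior points: the formula from the docs, vectorized over all i
    hl = x[1:-1] - x[:-2]
    hr = x[2:] - x[1:-1]
    g[1:-1] = (hl**2 * y[2:] - hr**2 * y[:-2] + (hr**2 - hl**2) * y[1:-1]) / (hr * hl**2 + hl * hr**2)
    return g

x = torch.tensor([-2., -1., 1., 4.])
y = torch.tensor([4., 1., 1., 16.])
print(manual_gradient(y, x))               # tensor([-3., -2.,  2.,  5.])
print(torch.gradient(y, spacing=(x,))[0])  # tensor([-3., -2.,  2.,  5.])

On this example the manual version reproduces torch.gradient's output exactly.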
