PyTorch tensor and NumPy array show different results with the same code

Hi, I was implementing simple numerical differentiation with a PyTorch tensor, and the results were strange compared to a NumPy array. With the PyTorch tensor, I noticed that the tmp_val value changes after multi_func(x) is calculated, while with the NumPy array it does not.

Why is this happening, and how do I modify the code to get the same results as with the NumPy array?

import torch

def multi_func(x):
    return x[0]**2 + x[1]**2


def derivative(f, x):
    h = 1e-4 # 0.0001
    grad = torch.zeros_like(x) 
    
    for idx in range(x.size()[0]):
        tmp_val = x[idx]
        
        # f(x+h)
        x[idx] = float(tmp_val) + h
        fxh1 = f(x)

        print(tmp_val)
        
        # f(x-h) 
        x[idx] = tmp_val - h 
        fxh2 = f(x) 
        
        grad[idx] = (fxh1 - fxh2) / (2*h)
        x[idx] = tmp_val
        
    return grad

print(derivative(multi_func, torch.tensor([3.0,4.0])))

tensor(3.0001)
tensor(4.0001)
tensor([2.9945, 4.0054])

=========================================================

import numpy as np

def multi_func(x):
    return x[0]**2 + x[1]**2

def derivative(f, x):
    h = 1e-4 # 0.0001
    grad = np.zeros_like(x) 
    
    for idx in range(x.size):
        tmp_val = x[idx]
        
        # f(x+h) 
        x[idx] = float(tmp_val) + h
        fxh1 = f(x)
        
        print(tmp_val)
        
        # f(x-h) 
        x[idx] = tmp_val - h 
        fxh2 = f(x) 
        
        grad[idx] = (fxh1 - fxh2) / (2*h)
        x[idx] = tmp_val
        
    return grad

print(derivative(multi_func, np.array([3.0,4.0])))

3.0
4.0
[6. 8.]

You are assigning x[idx] to tmp_val, but that is not a copy; it's simply a view of your tensor at that index. This means tmp_val references the same memory as x[idx], so changing x[idx] changes tmp_val.

You then add h = 1e-4 to x[idx], so tmp_val now points at the value x[idx] + h, which is why it prints 3.0001 and 4.0001.
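A minimal sketch of the aliasing in isolation (variable names mirror your code):

import torch

x = torch.tensor([3.0, 4.0])
tmp_val = x[0]      # 0-dim tensor sharing storage with x -- not a copy
x[0] = x[0] + 1e-4  # in-place write into x's storage
print(tmp_val)      # tensor(3.0001) -- the "saved" value changed as well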

NumPy integer indexing, on the other hand, gives a copy rather than a view: selecting a single element of an array returns a NumPy scalar (e.g. np.float64), which does not share memory with the array.
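The same experiment with NumPy shows the difference:

import numpy as np

x = np.array([3.0, 4.0])
tmp_val = x[0]      # np.float64 scalar -- a copy, not a view
x[0] = x[0] + 1e-4  # only the array changes
print(tmp_val)      # 3.0 -- the saved value is untouched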

This is why your NumPy code works as intended. To remedy the problem in PyTorch, simply use tmp_val = x[idx].clone() to copy the value out of the tensor. This should yield the result you wanted.
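Concretely, only the one assignment inside the loop changes; the rest of derivative stays the same:

    for idx in range(x.size()[0]):
        tmp_val = x[idx].clone()  # independent copy; later writes to x[idx] leave it intact

With that change, print(derivative(multi_func, torch.tensor([3.0, 4.0]))) should print approximately tensor([6., 8.]), up to float32 rounding.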

One more thing: NumPy by default uses double precision (float64) for floats, while PyTorch uses single precision (float32). This means your operations might not give the same results in terms of precision. You can change PyTorch's default dtype if you need the two to match.
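For example (a sketch; torch.set_default_dtype is the global switch, and you can also set the dtype per tensor):

import torch

torch.set_default_dtype(torch.float64)  # new floating-point tensors now default to float64
x = torch.tensor([3.0, 4.0])
print(x.dtype)  # torch.float64

# or, without touching the global default:
y = torch.tensor([3.0, 4.0], dtype=torch.float64)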

Hope this answers your question.


Awesome!!! Thanks a lot @Youyoun!! The question has been completely resolved. Why didn't I think that the mechanism would be different between PyTorch and NumPy? Thank you so much. I got a lot of inspiration.
