Backward() None issue with autograd

Hi everyone. I have this piece of code and I am wondering about the difference in execution when I change some lines.

With this code I get the following output:

But when I change this section:

like this:

the code works normally. Here is the output:

I thought that

Please, could someone explain this behavior to me?
Thanks

I believe this will explain the problem you are facing.

Hi,

The difference is that:

  • a = a - 5 will create a new Tensor that contains the result of a - 5 and then set the variable a to point to that new Tensor.
  • a -= 5 will change the Tensor pointed to by a in place, computing a - 5 and saving the result in the original Tensor (see the sketch below).
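
A minimal sketch of the two cases (with made-up values, just for illustration):

import torch

a = torch.tensor([1.0, 2.0], requires_grad=True)
print(a.is_leaf)        # True: a is a leaf tensor

# Out-of-place: a - 5 builds a new tensor that is part of the autograd graph;
# writing a = a - 5 would simply rebind the name a to that new tensor.
b = a - 5
print(b.is_leaf)        # False
print(b.grad_fn)        # <SubBackward0 object ...>

# In-place: the original tensor's storage is modified. On a leaf that requires
# grad this is only allowed under torch.no_grad(), as in the training loops below.
with torch.no_grad():
    a -= 5
print(a.is_leaf)        # still True: a is the same tensor object as before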

Thank you both for your answers, I see it clearly now.
Here I use the -= operator, so I need to zero the gradients in order to loop the gradient computation.

import torch

# h_calculate and error are the model / loss helpers defined earlier in the post
lr=0.02
b=torch.tensor([0.7], dtype=torch.float32, requires_grad=True) #bias
w=torch.tensor([0.3,-0.8], dtype=torch.float32, requires_grad=True) #weights
inputs=torch.tensor([[0.5, 1.2],[-0.8, 0.6]])
y=torch.tensor([1],dtype=torch.float32)


for i in range(5):
    print("Iteration=",i)
    y_hat=h_calculate(inputs,w,b)
    y_hat_final=torch.sigmoid(y_hat)
    loss=error(y_hat_final,y)
    loss.backward()
    print(loss)
    with torch.no_grad(): #update the parameters without recording the updates in the graph
        #w=w-lr*w.grad
        #b=b-lr*b.grad
        #print(w.requires_grad)
        w-=lr*w.grad #in-place update: w and b keep pointing to the same tensors
        b-=lr*b.grad
        #print(w.requires_grad)
        print(w)
        print(b)
        #print(w.grad)
        #print(b.grad)
        w.grad.zero_() #reset the accumulated gradients before the next iteration
        b.grad.zero_()
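
To see why the zeroing is needed here: backward() accumulates gradients into .grad, so without zero_() the gradients from earlier iterations would add up. A small sketch (made-up numbers) of that accumulation:

import torch

x = torch.tensor([2.0], requires_grad=True)

(x * 3).sum().backward()
print(x.grad)          # tensor([3.])

(x * 3).sum().backward()
print(x.grad)          # tensor([6.])  <- accumulated with the previous call

x.grad.zero_()         # reset before the next backward()
(x * 3).sum().backward()
print(x.grad)          # tensor([3.]) again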

But here I no longer need to zero the gradients, since w and b now point to new Tensors; I just have to set requires_grad back to True to be able to compute the gradients in the next iteration.

b=torch.tensor([0.7], dtype=torch.float32, requires_grad=True) #bias
w=torch.tensor([0.3,-0.8], dtype=torch.float32, requires_grad=True) #weights
#(this forward pass before the loop is redundant; the loop recomputes it)
y_hat=h_calculate(inputs,w,b)
y_hat_final=torch.sigmoid(y_hat)
loss=error(y_hat_final,y)
for i in range(5):
    print("i=",i)
    y_hat=h_calculate(inputs,w,b)
    y_hat_final=torch.sigmoid(y_hat)
    loss=error(y_hat_final,y)
    loss.backward()
    print(loss)
    with torch.no_grad():
        w=w-lr*w.grad #out-of-place: w and b now point to new tensors
        b=b-lr*b.grad
        w.requires_grad=True #the new tensors were created under no_grad(), so re-enable grad tracking
        b.requires_grad=True
        #w-=lr*w.grad
        #b-=lr*b.grad
        print(w)
        print(b)
        print(w.grad)
        print(b.grad)
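
The reason requires_grad has to be switched back on: inside torch.no_grad(), w - lr * w.grad produces a brand-new tensor with requires_grad=False and no .grad buffer, so autograd would ignore it in the next iteration. A small sketch (made-up values) of that behavior:

import torch

w = torch.tensor([0.3, -0.8], requires_grad=True)
(w * 2).sum().backward()
print(w.grad)                    # tensor([2., 2.])

with torch.no_grad():
    w = w - 0.02 * w.grad        # new tensor, detached from the graph
print(w.requires_grad, w.grad)   # False None

w.requires_grad = True           # re-enable grad tracking for the next backward()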

These two snippets give the same output.

Thanks again.
