Help! Why is grad always NaN?

It is a Multiple Linear Regression model,
but the gradient does not seem to work:

theta.requires_grad = True
for epoch in range(epochs):
    y_pred = X_b.mm(theta)
    loss = torch.sqrt((y_pred - y).sum())
    loss.backward()
    print(theta.grad)
#     theta -= lr * theta.grad

tensor([[nan],
        [nan],
        [nan],
        [nan]], device='cuda:0', dtype=torch.float64)

(the same all-NaN gradient is printed on every iteration)


Hi,

What are the values of y and y_pred?
The gradient of sqrt at 0 is infinite, which will produce NaNs.
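
To make that concrete, here is a minimal standalone sketch (not from the original thread) showing what sqrt does at and below zero:

import torch

# Minimal repro of the point above: d/dx sqrt(x) = 1 / (2 * sqrt(x)),
# which diverges as x -> 0.
x = torch.tensor(0.0, requires_grad=True)
torch.sqrt(x).backward()
print(x.grad)  # tensor(inf)

# A negative input is NaN already in the forward pass, and the NaN
# propagates into the gradient. Note that in the posted loop,
# (y_pred - y).sum() can easily be negative.
x = torch.tensor(-1.0, requires_grad=True)
y = torch.sqrt(x)
print(y)       # tensor(nan, ...)
y.backward()
print(x.grad)  # tensor(nan)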
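
If the intent was an RMSE-style loss, one possible fix (a sketch only, assuming X_b, y, theta, lr, and epochs are defined as in the question) is to square the residuals so the argument of sqrt is never negative, add a small epsilon so the gradient stays finite even at a perfect fit, and zero theta.grad each step, since .backward() accumulates gradients and the posted loop never resets them:

for epoch in range(epochs):
    y_pred = X_b.mm(theta)
    # Squared residuals keep the sqrt argument non-negative;
    # the epsilon keeps its gradient finite when the loss hits 0.
    loss = torch.sqrt(((y_pred - y) ** 2).mean() + 1e-12)
    loss.backward()
    with torch.no_grad():
        theta -= lr * theta.grad
        theta.grad.zero_()  # .backward() accumulates otherwise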