I am just experimenting with defining a new autograd Function for sigmoid, MySigmoid, similar to MyReLU in the tutorial, but the loss is printing as nan.
What is wrong?
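For reference, the MyReLU custom Function from the PyTorch examples tutorial that this post refers to looks roughly like this (paraphrased; the exact tutorial version may differ slightly):

    import torch

    class MyReLU(torch.autograd.Function):
        @staticmethod
        def forward(ctx, input):
            ctx.save_for_backward(input)
            return input.clamp(min=0)

        @staticmethod
        def backward(ctx, grad_output):
            input, = ctx.saved_tensors
            grad_input = grad_output.clone()
            grad_input[input < 0] = 0   # gradient is zero where the input was negative
            return grad_input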
Not sure if it is the root cause, but you shouldn't use numpy functions.
I get the same problem even if I use torch.exp.
I just tried your code on master and the forward works fine. When are you seeing the loss become nan? On the first forward? After a backward?
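If the nan is coming from exp(-input) overflowing in float32 (just a guess, not confirmed in this thread), one way to sidestep it is to build the forward on torch.sigmoid, which is numerically stable, and save the output instead of the input. A minimal sketch (the class name StableSigmoid is only illustrative):

    import torch

    class StableSigmoid(torch.autograd.Function):
        @staticmethod
        def forward(ctx, input):
            out = torch.sigmoid(input)   # numerically stable built-in sigmoid
            ctx.save_for_backward(out)   # the derivative can be written in terms of the output
            return out

        @staticmethod
        def backward(ctx, grad_output):
            out, = ctx.saved_tensors
            return grad_output * out * (1.0 - out)   # d sigmoid(x)/dx = s * (1 - s)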
Sorry to revive this topic:
I managed to implement a working version of MySigmoid, but I'm also having some issues:
import torch

class MySigmoid(torch.autograd.Function):
    @staticmethod
    def forward(ctx, input):
        ctx.save_for_backward(input)
        sigmoid_eval = 1.0 / (1.0 + torch.exp(-input))
        return sigmoid_eval

    @staticmethod
    def backward(ctx, grad_output):
        input, = ctx.saved_tensors
        grad_temp = grad_output.clone()
        grad_input = torch.exp(-input) / ((1.0 + torch.exp(-input)).pow(2))
        return grad_input

dtype = torch.float
device = torch.device("cpu")
N, D_in, D_out = 1000, 1, 1
alpha = 1e-2

# input with a bias column of ones appended
x = torch.cat(
    (torch.randn(N, D_in, device=device, dtype=dtype),
     torch.ones([N, 1], dtype=dtype, device=device)),
    1)
y = x[:, 0:1] * 3.5 + 2.5 * torch.randn(N, 1)

w = torch.randn(D_in + 1, D_out, device=device, dtype=dtype, requires_grad=True)
print(w.shape)
print(w.dtype)

for t in range(5000):
    sigmoid = MySigmoid.apply
    y_pred = sigmoid(x.mm(w))
    loss = (y_pred - y).pow(2).sum()
    print(t, loss.item())

    loss.backward()
    with torch.no_grad():
        w -= alpha * w.grad
        w.grad.zero_()
I made sure the dimensions were correct, but I am still getting this error:
Traceback (most recent call last):
  File "test.py", line 49, in <module>
    w -= alpha * w.grad
TypeError: unsupported operand type(s) for *: 'float' and 'NoneType'
I would appreciate it if someone could take a look at this, since I think it is very useful for those of us who have just started using torch. Thanks in advance.
The backward needs to apply the chain rule: return the incoming grad_output multiplied by the local sigmoid derivative, not the local derivative alone. Change it to:

    grad_input = grad_temp * torch.exp(-input) / ((1.0 + torch.exp(-input)).pow(2))
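Putting that fix into the Function above, and checking the analytic gradient against finite differences with torch.autograd.gradcheck (which wants double-precision inputs with requires_grad set), a quick sanity-check sketch might look like this:

    import torch

    class MySigmoid(torch.autograd.Function):
        @staticmethod
        def forward(ctx, input):
            ctx.save_for_backward(input)
            return 1.0 / (1.0 + torch.exp(-input))

        @staticmethod
        def backward(ctx, grad_output):
            input, = ctx.saved_tensors
            # chain rule: upstream gradient times the local sigmoid derivative
            local_grad = torch.exp(-input) / (1.0 + torch.exp(-input)).pow(2)
            return grad_output * local_grad

    # gradcheck compares the custom backward against numerical gradients
    x = torch.randn(20, 1, dtype=torch.double, requires_grad=True)
    print(torch.autograd.gradcheck(MySigmoid.apply, (x,)))   # should print True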