I am experimenting with defining a new autograd Function for sigmoid, MySigmoid, similar to MyReLU in the tutorial, but the loss prints as nan.

What is wrong?

Not sure if it is the root cause, but you shouldn’t use numpy functions.

Same problem even if I use torch.exp.

I just tried your code on master and the forward works fine. When are you seeing loss becoming nan? First forward? After a backward?
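One way to answer that question is to check each intermediate tensor for nan or inf. The sketch below is only an illustration: the helper name `first_nonfinite` is made up, and the formulas are the exp-based sigmoid expressions, used here as an assumed example of where things can go wrong. For a large negative input, `torch.exp(-x)` overflows float32 to inf, so the forward value is still finite while `exp(-x) / (1 + exp(-x))**2` becomes inf/inf = nan.

```python
import torch


def first_nonfinite(named_tensors):
    """Return the name of the first tensor containing nan or inf, else None.

    Hypothetical helper for illustration; not part of PyTorch.
    """
    for name, tensor in named_tensors:
        if not torch.isfinite(tensor).all():
            return name
    return None


x = torch.tensor([-100.0, 0.0, 100.0])
e = torch.exp(-x)                  # exp(100) overflows float32 to inf
forward = 1.0 / (1.0 + e)          # 1 / inf = 0, so the forward pass stays finite
local_grad = e / (1.0 + e).pow(2)  # inf / inf = nan for the first entry

print(first_nonfinite([("forward", forward), ("local_grad", local_grad)]))
```

If the forward output is finite but the gradient is not, the nan will only show up in the loss after the first weight update.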

Sorry to revive this topic:
I managed to implement a working version of MySigmoid, but I'm also having some issues:

import torch

class MySigmoid(torch.autograd.Function):
    def forward(ctx, input):
        sigmoid_eval = 1.0/(1.0 + torch.exp(-input))
        return sigmoid_eval

    def backward(ctx, grad_output):
        input, = ctx.saved_tensors
        grad_temp = grad_output.clone()
        grad_input = torch.exp(-input)/((1.0 + torch.exp(-input)).pow(2))
        return grad_input

dtype = torch.float
device = torch.device("cpu")

N, D_in, D_out = 1000, 1, 1
alpha = 1e-2

x =
                (   torch.randn(N, D_in, device=device, dtype=dtype) ,
                    torch.ones([N,1], dtype=dtype, device=device)   ),
y = x[:,0:1]*3.5 + 2.5*torch.randn(N,1)
w = torch.randn(D_in+1, D_out, device=device, dtype=dtype, requires_grad=True)


for t in range(5000):
    sigmoid = MySigmoid.apply
    y_pred = sigmoid(

    loss = (y_pred - y).pow(2).sum()
    print(t, loss.item())

    with torch.no_grad():
        w -=  alpha * w.grad

I made sure the dimensions were correct, but I'm still having these issues:

  • If I try to multiply w by a scalar I get this:
Traceback (most recent call last):
  File "", line 49, in <module>
    w -=  alpha * w.grad
TypeError: unsupported operand type(s) for *: 'float' and 'NoneType'
  • The convergence is taking too long (which is why I'm trying to modify the starting w values)

I would appreciate it if someone could take a look at this, which I think is very useful for those of us who have just started using torch. Thanks in advance.

Change the grad_input line in backward to:
grad_input = grad_temp * torch.exp(-input) / ((1.0 + torch.exp(-input)).pow(2))
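For anyone landing here later, this is a minimal sketch of how the pieces might fit together, not necessarily the original poster's exact code. It multiplies by the upstream gradient as suggested, and it also adds a few things the posted snippet does not show: `@staticmethod` on both methods, `ctx.save_for_backward` so that `ctx.saved_tensors` has something to unpack, and a `loss.backward()` call so that `w.grad` is not None. Assumptions on my side: `x` is built with `torch.cat`, the prediction is `sigmoid(x.mm(w))`, the loop is shortened to 200 steps, and the gradient is written as `out * (1 - out)`, which is algebraically the same as the exp-based expression but avoids inf/inf for large negative inputs.

```python
import torch


class MySigmoid(torch.autograd.Function):
    @staticmethod
    def forward(ctx, input):
        out = 1.0 / (1.0 + torch.exp(-input))
        ctx.save_for_backward(out)  # backward reads this via ctx.saved_tensors
        return out

    @staticmethod
    def backward(ctx, grad_output):
        out, = ctx.saved_tensors
        # Chain rule: upstream gradient times d(sigmoid)/d(input) = out * (1 - out).
        return grad_output * out * (1.0 - out)


torch.manual_seed(0)
N, D_in, D_out = 1000, 1, 1
alpha = 1e-2

# Assumption: one feature column plus a column of ones for the bias term.
x = torch.cat((torch.randn(N, D_in), torch.ones(N, 1)), 1)
y = x[:, 0:1] * 3.5 + 2.5 * torch.randn(N, 1)
w = torch.randn(D_in + 1, D_out, requires_grad=True)

for t in range(200):  # shortened from 5000 for illustration
    y_pred = MySigmoid.apply(x.mm(w))  # assumption: linear model fed to the sigmoid
    loss = (y_pred - y).pow(2).sum()
    if t % 50 == 0:
        print(t, loss.item())

    loss.backward()  # populates w.grad; without this call w.grad stays None
    with torch.no_grad():
        w -= alpha * w.grad
        w.grad.zero_()  # gradients accumulate across backward() calls
```

A sigmoid only outputs values in (0, 1), while y here ranges well outside that interval, so the loss cannot get close to zero whatever w is. That may be part of why convergence looks slow.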