I am just experimenting with defining a new autograd Function for sigmoid, MySigmoid, similar to MyReLU in the tutorial, but the loss is printing as nan.
What is wrong?
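For reference, the MyReLU custom Function from the PyTorch examples tutorial that this post refers to looks roughly like this (paraphrased; the exact tutorial version may differ slightly):

    import torch

    class MyReLU(torch.autograd.Function):
        @staticmethod
        def forward(ctx, input):
            ctx.save_for_backward(input)
            return input.clamp(min=0)

        @staticmethod
        def backward(ctx, grad_output):
            input, = ctx.saved_tensors
            grad_input = grad_output.clone()
            grad_input[input < 0] = 0   # gradient is zero where the input was negative
            return grad_input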
Not sure if it is the root cause, but you shouldn't use numpy functions.
I get the same problem even if I use torch.exp.
I just tried your code on master and the forward works fine. When are you seeing the loss become nan? On the first forward? After a backward?
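If the nan is coming from exp(-input) overflowing in float32 (just a guess, not confirmed in this thread), one way to sidestep it is to build the forward on torch.sigmoid, which is numerically stable, and save the output instead of the input. A minimal sketch (the class name StableSigmoid is only illustrative):

    import torch

    class StableSigmoid(torch.autograd.Function):
        @staticmethod
        def forward(ctx, input):
            out = torch.sigmoid(input)   # numerically stable built-in sigmoid
            ctx.save_for_backward(out)   # the derivative can be written in terms of the output
            return out

        @staticmethod
        def backward(ctx, grad_output):
            out, = ctx.saved_tensors
            return grad_output * out * (1.0 - out)   # d sigmoid(x)/dx = s * (1 - s)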
Sorry to revive this topic:
I managed to implement a working version of MySigmoid, but I'm also having some issues:
import torch

class MySigmoid(torch.autograd.Function):
    @staticmethod
    def forward(ctx, input):
        ctx.save_for_backward(input)
        sigmoid_eval = 1.0 / (1.0 + torch.exp(-input))
        return sigmoid_eval

    @staticmethod
    def backward(ctx, grad_output):
        input, = ctx.saved_tensors
        grad_temp = grad_output.clone()
        grad_input = torch.exp(-input) / ((1.0 + torch.exp(-input)).pow(2))
        return grad_input

dtype = torch.float
device = torch.device("cpu")
N, D_in, D_out = 1000, 1, 1
alpha = 1e-2

# input with a bias column of ones appended
x = torch.cat(
    (torch.randn(N, D_in, device=device, dtype=dtype),
     torch.ones([N, 1], dtype=dtype, device=device)),
    1)
y = x[:, 0:1] * 3.5 + 2.5 * torch.randn(N, 1)

w = torch.randn(D_in + 1, D_out, device=device, dtype=dtype, requires_grad=True)
print(w.shape)
print(w.dtype)

for t in range(5000):
    sigmoid = MySigmoid.apply
    y_pred = sigmoid(x.mm(w))
    loss = (y_pred - y).pow(2).sum()
    print(t, loss.item())

    loss.backward()
    with torch.no_grad():
        w -= alpha * w.grad
        w.grad.zero_()
I made sure the dimensions were correct, but I am still getting this error:
Traceback (most recent call last):
  File "test.py", line 49, in <module>
    w -= alpha * w.grad
TypeError: unsupported operand type(s) for *: 'float' and 'NoneType'
I would appreciate it if someone could take a look at this, since I think it is very useful for those of us who have just started using torch. Thanks in advance.
The backward needs to apply the chain rule: return the incoming grad_output multiplied by the local sigmoid derivative, not the local derivative alone. Change it to:

    grad_input = grad_temp * torch.exp(-input) / ((1.0 + torch.exp(-input)).pow(2))
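Putting that fix into the Function above, and checking the analytic gradient against finite differences with torch.autograd.gradcheck (which wants double-precision inputs with requires_grad set), a quick sanity-check sketch might look like this:

    import torch

    class MySigmoid(torch.autograd.Function):
        @staticmethod
        def forward(ctx, input):
            ctx.save_for_backward(input)
            return 1.0 / (1.0 + torch.exp(-input))

        @staticmethod
        def backward(ctx, grad_output):
            input, = ctx.saved_tensors
            # chain rule: upstream gradient times the local sigmoid derivative
            local_grad = torch.exp(-input) / (1.0 + torch.exp(-input)).pow(2)
            return grad_output * local_grad

    # gradcheck compares the custom backward against numerical gradients
    x = torch.randn(20, 1, dtype=torch.double, requires_grad=True)
    print(torch.autograd.gradcheck(MySigmoid.apply, (x,)))   # should print True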