I was trying to reproduce a model where the activation function is defined as:
x / sqrt(1 + x**2)
where x is a vector.
I was wondering how I can define this operation and still benefit from the autograd magic.
Thank you very much in advance.
I think you can just write a simple function like:
def activation(x):
    num = x
    den = torch.sqrt(1.0 + torch.mul(x, x))
    return torch.div(num, den)
Provided that “x” is a torch tensor and that you use the output to compute (directly or indirectly) the loss value L, then the gradients will be automatically handled by the autograd package when you call L.backward().
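To make that concrete, here is a minimal sketch that checks the autograd gradient against the derivative worked out by hand, which for x / sqrt(1 + x**2) is (1 + x**2)**(-3/2). The function name `activation`, the sample values and the use of `sum()` as a stand-in loss are just assumptions for illustration.

```python
import torch

def activation(x):
    # Elementwise x / sqrt(1 + x^2); built only from differentiable torch ops
    return x / torch.sqrt(1.0 + x * x)

# requires_grad=True tells autograd to track operations on x
x = torch.tensor([-2.0, 0.0, 0.5, 3.0], requires_grad=True)
y = activation(x)

# Stand-in scalar loss (an assumption; any scalar computed from y would do)
loss = y.sum()
loss.backward()

# Hand-derived derivative: d/dx [x / sqrt(1 + x^2)] = (1 + x^2)^(-3/2)
expected = (1.0 + x.detach() ** 2) ** -1.5
print(torch.allclose(x.grad, expected))
```

If the two agree, autograd is differentiating the function for you and no hand-written backward pass is needed.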
Alternatively you can define a custom nn.Module, overriding its forward() method with the same code as in the function above, and then use it as a new layer in a DNN (for example).
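The module version could look roughly like the sketch below. The class name `SqrtActivation`, the layer sizes and the dummy loss are hypothetical, picked only to show the module sitting between two linear layers.

```python
import torch
import torch.nn as nn

class SqrtActivation(nn.Module):  # hypothetical name, not a built-in layer
    def forward(self, x):
        # Same computation as the plain function: x / sqrt(1 + x^2)
        return x / torch.sqrt(1.0 + x * x)

# Use it like any other layer (sizes are arbitrary for the example)
model = nn.Sequential(
    nn.Linear(4, 8),
    SqrtActivation(),
    nn.Linear(8, 1),
)

inputs = torch.randn(16, 4)
out = model(inputs)

# Dummy scalar loss, just to trigger backpropagation
loss = out.pow(2).mean()
loss.backward()

# Every parameter of the linear layers now has a gradient
print(all(p.grad is not None for p in model.parameters()))
```

The module has no parameters of its own, so the only difference from the plain function is that it can be dropped into containers such as nn.Sequential.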
Hope this helps
Thank you very much.
I tried the first solution you proposed and it worked perfectly!!