A new activation function named “swish” came out, and I tried to implement it as a custom layer following this example (http://pytorch.org/docs/master/notes/extending.html#extending-torch-autograd) and the paper (https://arxiv.org/pdf/1710.05941.pdf).
Is this a proper way of making a custom activation function?
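For reference, the paper defines swish(x) = x * sigmoid(x) (with beta = 1), so the derivative works out to sigmoid(x) + x * sigmoid(x) * (1 - sigmoid(x)), which simplifies to swish(x) + sigmoid(x) * (1 - swish(x)); that is the expression I return in backward below.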
import torch
import torch.nn as nn
from torch.autograd import Function

class Swish(Function):
    @staticmethod
    def forward(ctx, i):
        result = i * i.sigmoid()
        ctx.save_for_backward(result, i)
        return result

    @staticmethod
    def backward(ctx, grad_output):
        result, i = ctx.saved_tensors
        sigmoid_x = i.sigmoid()
        # d/dx swish(x) = swish(x) + sigmoid(x) * (1 - swish(x))
        return grad_output * (result + sigmoid_x * (1 - result))

swish = Swish.apply

class Swish_module(nn.Module):
    def forward(self, x):
        return swish(x)

swish_layer = Swish_module()
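As a sanity check, the hand-written backward can be compared against numerical gradients with torch.autograd.gradcheck. A minimal sketch (gradcheck requires double-precision inputs; the tensor shape here is just an arbitrary example):

from torch.autograd import gradcheck

x = torch.randn(8, dtype=torch.double, requires_grad=True)
# Prints True if the custom backward matches the numerical gradient
print(gradcheck(Swish.apply, (x,)))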