I tried to write my own custom layer for sigmoid, but it doesn’t work: the network does not train. If I replace the activation function with the standard torch.nn.Sigmoid(), then it starts learning and achieves great accuracy. I can’t figure out what’s wrong with my custom sigmoid. Help me please, guys! Thanks
Hi Frank!
Thank you for your answer! Yes, I also checked that the output of my module is the same as the library’s. But for some reason, with my sigmoid module, the network does not train. I still can’t figure out why. Maybe torch.nn.Sigmoid somehow handles gradients in a special way… idk
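If it helps to make that check concrete, here is a minimal sketch that compares both the forward output and the autograd gradient of a custom sigmoid against torch.sigmoid(). Note the SigmoidCustom here is just a plausible stand-in (the thread’s actual implementation isn’t shown), so substitute your own class:

```python
import torch

# Plausible stand-in for the thread's SigmoidCustom (assumption:
# an elementwise 1 / (1 + exp(-x)) module).
class SigmoidCustom(torch.nn.Module):
    def forward(self, x):
        return 1.0 / (1.0 + torch.exp(-x))

# Two independent leaves so each backward pass is separate.
x1 = torch.linspace(-5.0, 5.0, 11, requires_grad=True)
x2 = torch.linspace(-5.0, 5.0, 11, requires_grad=True)

y_custom = SigmoidCustom()(x1)
y_torch = torch.sigmoid(x2)

# Forward passes should agree to within float32 tolerance.
print(torch.allclose(y_custom, y_torch))

# The autograd gradients should agree as well.
y_custom.sum().backward()
y_torch.sum().backward()
print(torch.allclose(x1.grad, x2.grad))
```

If both checks pass but the network still only trains with torch.nn.Sigmoid(), the problem is likely elsewhere in the model.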
It turns out that the gradient for torch.sigmoid() is somewhat worse
than it could be. Your SigmoidCustom displays similar imperfect
behavior, but is rather better than torch.sigmoid().
See this post:
Nonetheless, I think it is unlikely that this difference between your SigmoidCustom and PyTorch’s standard implementation explains
the issue you are seeing. Both versions of sigmoid() are basically
okay, and the imperfection in their gradients is more of an edge case.
I would only expect this to matter if your network were close to
training unstably.
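To illustrate the kind of edge case involved: deep in sigmoid’s saturation region, the y * (1 - y) form that PyTorch uses for the backward pass rounds (1 - y) to zero in float32, while an exp-based formulation can still propagate a tiny gradient. This is a sketch under that assumption (custom_sigmoid is a stand-in, and the exact behavior depends on dtype and build):

```python
import torch

# Stand-in exp-based sigmoid (assumption; not the thread's actual code).
def custom_sigmoid(x):
    return 1.0 / (1.0 + torch.exp(-x))

# Two independent leaves, deep in the saturated region.
x1 = torch.tensor([80.0], requires_grad=True)
x2 = torch.tensor([80.0], requires_grad=True)

torch.sigmoid(x1).sum().backward()
custom_sigmoid(x2).sum().backward()

# torch.sigmoid's gradient underflows to exactly zero here,
# while the exp-based chain keeps a tiny nonzero value (~exp(-80)).
print(x1.grad.item())
print(x2.grad.item())
```

Either way, gradients this small are effectively zero for training purposes, which is why this only matters at the margins of stability.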
I would suggest trying something like this:
class LinearNet(torch.nn.Module):
    def __init__(self, in_features, out_features, hid_neurons=200, useCustom=True):
        super(LinearNet, self).__init__()
        if useCustom:
            sig = SigmoidCustom()
        else:
            sig = torch.nn.Sigmoid()
        self.lin1 = LinearCustom(in_features, hid_neurons, bias=True)
        self.act1 = sig  # sigmoid has no parameters, so sharing one instance is fine
        self.lin2 = LinearCustom(hid_neurons, hid_neurons, bias=True)
        self.act2 = sig
        self.lin3 = LinearCustom(hid_neurons, out_features, bias=True)
        self.softmax = torch.nn.Softmax(dim=1)

    def forward(self, x):
        x = self.act1(self.lin1(x))
        x = self.act2(self.lin2(x))
        return self.softmax(self.lin3(x))
This way you can switch between the standard and custom sigmoid()
with a single flag and not risk letting some other bug creep in by changing
additional code.
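As a self-contained sketch of that diagnostic, the snippet below trains the same architecture for a few SGD steps under each flag setting and compares the final losses. LinearCustom and SigmoidCustom are stand-ins here (plain torch.nn.Linear and an elementwise 1/(1+exp(-x)) module — the thread’s real classes aren’t shown), and I use raw logits with CrossEntropyLoss instead of the final Softmax to keep the sketch short:

```python
import torch

# Stand-ins for the thread's custom classes (assumptions).
LinearCustom = torch.nn.Linear

class SigmoidCustom(torch.nn.Module):
    def forward(self, x):
        return 1.0 / (1.0 + torch.exp(-x))

class LinearNet(torch.nn.Module):
    def __init__(self, in_features, out_features, hid_neurons=200, useCustom=True):
        super(LinearNet, self).__init__()
        sig = SigmoidCustom() if useCustom else torch.nn.Sigmoid()
        self.lin1 = LinearCustom(in_features, hid_neurons, bias=True)
        self.act1 = sig
        self.lin2 = LinearCustom(hid_neurons, hid_neurons, bias=True)
        self.act2 = sig
        self.lin3 = LinearCustom(hid_neurons, out_features, bias=True)

    def forward(self, x):
        x = self.act1(self.lin1(x))
        x = self.act2(self.lin2(x))
        return self.lin3(x)  # raw logits; CrossEntropyLoss applies softmax

def few_steps(useCustom, steps=20):
    torch.manual_seed(0)  # identical init and data for both runs
    net = LinearNet(10, 3, hid_neurons=16, useCustom=useCustom)
    opt = torch.optim.SGD(net.parameters(), lr=0.1)
    x = torch.randn(64, 10)
    t = torch.randint(0, 3, (64,))
    loss_fn = torch.nn.CrossEntropyLoss()
    for _ in range(steps):
        opt.zero_grad()
        loss = loss_fn(net(x), t)
        loss.backward()
        opt.step()
    return loss.item()

loss_custom = few_steps(True)
loss_std = few_steps(False)
print(loss_custom, loss_std)
```

If the two losses track each other closely, the sigmoid module is not the culprit; if only the useCustom=False run learns, the difference is isolated to the custom activation.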