Hi guys,
I’m new here and have only been learning PyTorch for three days. I tried to implement a small convnet demo, but it didn’t work: the training loss never changed. I figured out that when I ran loss.backward(), the gradients were always 0. Here is my demo code; is there something wrong with it?
import torch
import torch.nn as nn
import torch.nn.functional as F
import torch.optim as optim

class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        self.conv1 = nn.Conv2d(1, 10, 3, padding=1)
        self.conv2 = nn.Conv2d(10, 1, 1)

    def forward(self, _x):
        _x = F.relu(self.conv1(_x))
        _x = F.softmax(self.conv2(_x), dim=1)  # softmax over the channel dimension
        return _x

class myLoss(nn.Module):
    def __init__(self):
        super(myLoss, self).__init__()

    def forward(self, predict, target):
        # flatten both tensors to (batch, -1) and take the mean difference
        predict = predict.view(predict.size(0), -1)
        target = target.view(target.size(0), -1)
        return (predict - target).mean()

x = torch.rand(10, 1, 100, 100)
y = torch.rand(10, 1, 100, 100)

net = Net()
opt = optim.Adam(net.parameters(), lr=0.1)
cr = myLoss()

for epoch in range(1000):
    output = net(x)
    loss = cr(output, y)
    opt.zero_grad()
    # print(loss.item())
    loss.backward()
    # for f in net.parameters():
    #     print(f.grad)
    opt.step()
In this code, the loss never changed and f.grad was always 0.
It was my misuse of the softmax function. I just changed it to the sigmoid function and it works now. I also think it is advisable to print the output of every layer to see where the mistake happens: maybe the initial weights, the relu function, etc.
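To make it concrete: softmax normalizes across one dimension, and in this network conv2 outputs a single channel, so there is nothing to normalize against. Every output element becomes exactly 1.0 and no gradient flows back. A quick check (a minimal sketch, using the same shapes as the code above):

import torch
import torch.nn.functional as F

# conv2 above outputs one channel, so its activations have shape (N, 1, H, W).
# Softmax over dim=1 normalizes a "distribution" of length 1, which is always 1.0.
t = torch.rand(10, 1, 100, 100, requires_grad=True)
out = F.softmax(t, dim=1)
print(out.min().item(), out.max().item())  # 1.0 1.0: the output is constant

out.sum().backward()
print(t.grad.abs().max().item())  # 0.0: no gradient flows back through softmax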
Hi @Tao_jiang, thanks for the reply! I changed the softmax to sigmoid too, and it works!
But do you know why that is? I’ve heard that softmax + cross-entropy is a good combination, so why does it fail in this case?
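For reference, here is roughly how I’ve seen that combination used (a minimal sketch with made-up shapes, not the code above): the softmax dimension holds several classes, and nn.CrossEntropyLoss takes the raw logits and applies log-softmax internally.

import torch
import torch.nn as nn

# Hypothetical multi-class setup: 5 classes along the channel dimension.
logits = torch.randn(10, 5, 100, 100, requires_grad=True)  # raw scores, no softmax layer
target = torch.randint(0, 5, (10, 100, 100))               # one class index per pixel

criterion = nn.CrossEntropyLoss()  # applies log-softmax over dim=1 internally
loss = criterion(logits, target)
loss.backward()
print(logits.grad.abs().max().item() > 0)  # True: gradients flow in this setup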