Hi
Thank you for your quick reply. Here is some more info:
I’m using a DNN with 2 hidden layers and cross entropy as the loss function; the batch size is 32:
criterion = nn.CrossEntropyLoss()
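For context, the model is roughly like this (just a sketch; the layer sizes and variable names here are placeholders, not my exact code):

import torch
import torch.nn as nn

class DNN(nn.Module):
    # two hidden layers, as in my setup; the dimensions are placeholders
    def __init__(self, in_dim=100, hidden_dim=128, num_classes=45):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(in_dim, hidden_dim),
            nn.ReLU(),
            nn.Linear(hidden_dim, hidden_dim),
            nn.ReLU(),
            nn.Linear(hidden_dim, num_classes),
        )

    def forward(self, x):
        return self.net(x)

model = DNN()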
The output of the code in my post looks like this:
tensor([[0., 0., 0., …, 0., 0., 1.],
[1., 0., 0., …, 0., 1., 1.],
[0., 0., 0., …, 0., 1., 1.],
…,
[1., 0., 0., …, 0., 0., 0.],
[0., 0., 0., …, 0., 0., 1.],
[1., 0., 0., …, 0., 1., 1.]])
32
tensor([[ 0.6209, -0.8086, 0.1675, …, -0.2908, -1.3713, -1.6336],
[ 0.5608, -0.6456, 0.2424, …, -0.4206, -1.1513, -1.7551],
[ 0.5057, -0.9428, 0.2893, …, -0.5379, -1.2654, -1.5686],
…,
[ 0.6501, -0.6999, 0.3204, …, -0.3298, -1.3569, -1.6235],
[ 0.8948, -0.9117, 0.1544, …, -0.3279, -1.1618, -1.6269],
[ 0.7063, -0.7219, 0.2326, …, -0.3535, -1.2034, -1.7179]],
grad_fn=<ThAddmmBackward>)
tensor(5.2543, grad_fn=<NllLossBackward>)
done batch 0
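For completeness, the per-batch training step is roughly this (a sketch that mirrors the prints above; the full code is in my earlier post, and names like model, optimizer, x_batch, y_batch are placeholders):

optimizer.zero_grad()
out_train = model(x_batch)                  # logits for the batch
loss_train = criterion(out_train, y_batch)
print(y_batch)                              # the 0/1 label tensor above
print(y_batch.size(0))                      # the batch size (32 here, 1 below)
print(out_train)                            # the logits tensor
print(loss_train)                           # the loss value
loss_train.backward()                       # the call that stops working at batch size 1
optimizer.step()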
The strange thing is that when I change the batch size to 1, loss_train.backward() does not work at all.
The output looks like this when the batch size is 1:
begin batch 0
tensor([[0., 1., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 1., 0.,
0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 1., 0., 0., 0., 0., 0., 0., 0.,
0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.,
0., 0., 0., 0., 1., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.,
0., 0., 0., 0.]])
1
tensor([[-0.7247, 1.0524, -2.2573, -1.1343, -0.4040, -0.3480, 1.0350, 1.2908,
1.9561, 0.6749, 0.2105, -1.3460, 1.5483, 1.3202, -1.6915, -2.9539,
0.4566, -0.0062, -0.7644, 1.4134, -2.7010, 0.1742, 3.0185, -0.8109,
-0.1226, -0.1662, 1.5913, -1.1074, 1.0465, -0.3631, 0.7734, -1.1649,
-0.7790, -0.4970, -3.5045, 1.2011, -1.0401, -1.7327, 0.3079, 0.1145,
3.1292, -0.2242, -1.2624, 0.1882, 0.7860]],
grad_fn=<ThAddmmBackward>)
tensor(1.6626, grad_fn=<NllLossBackward>)