This is getting stranger by the minute:
- as expected, the gradient arriving at the loss layer is 1.
- the layers leading up to the output are: flatten layer --> fully connected (in=4096, out=1) --> ReLU (and then the MSE loss)
- calling out.register_hook(print) at these three layers yields:
tensor([[-20.],
[-20.]], device='cuda:0')
tensor([[0.],
[0.]], device='cuda:0')
tensor([[-0., -0., 0., ..., -0., -0., -0.],
[-0., -0., 0., ..., -0., -0., -0.]], device='cuda:0')
loss: 500.0
tensor([[-20.0000],
[-19.9298]], device='cuda:0')
tensor([[ 0.0000],
[-19.9298]], device='cuda:0')
tensor([[-0.0000, -0.0000, 0.0000, ..., -0.0000, -0.0000, -0.0000],
[ 0.0891, 0.2238, -0.0680, ..., 0.2725, 0.2421, 0.2217]],
device='cuda:0')
loss: 498.59820556640625
tensor([[-5.2828],
[ 9.6693]], device='cuda:0')
tensor([[-5.2828],
[ 9.6693]], device='cuda:0')
tensor([[-0.0157, 0.0200, -0.0573, ..., 0.1115, 0.0249, 0.0194],
[ 0.0287, -0.0366, 0.1050, ..., -0.2042, -0.0455, -0.0356]],
device='cuda:0')
loss: 160.7019805908203
tensor([[-20.],
[-20.]], device='cuda:0')
tensor([[0.],
[0.]], device='cuda:0')
tensor([[0., 0., 0., ..., -0., 0., 0.],
[0., 0., 0., ..., -0., 0., 0.]], device='cuda:0')
loss: 500.0
tensor([[-20.],
[-20.]], device='cuda:0')
tensor([[0.],
[0.]], device='cuda:0')
tensor([[0., 0., 0., ..., -0., 0., 0.],
[0., 0., 0., ..., -0., 0., 0.]], device='cuda:0')
loss: 500.0
tensor([[-20.],
[-20.]], device='cuda:0')
tensor([[0.],
[0.]], device='cuda:0')
tensor([[0., 0., 0., ..., -0., 0., 0.],
[0., 0., 0., ..., -0., 0., 0.]], device='cuda:0')
loss: 500.0
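For reference, the hook wiring described above can be sketched roughly like this (the layer sizes match the post; the input shape and target value are assumptions):

```python
import torch
import torch.nn as nn

# Minimal sketch of the flatten --> fc(4096, 1) --> ReLU --> MSE setup;
# input shape and target value are assumptions, not the original code.
torch.manual_seed(0)
fc = nn.Linear(4096, 1)
loss_fn = nn.MSELoss()

x = torch.randn(2, 4096, requires_grad=True)
flat = x.flatten(1)          # flatten layer output
pre = fc(flat)               # fully connected output
out = torch.relu(pre)        # ReLU output

# print the gradient that flows back into each of the three outputs
for t in (flat, pre, out):
    t.retain_grad()          # keep .grad around for inspection after backward
    t.register_hook(print)   # fires during backward, one printout per tensor

loss = loss_fn(out, torch.full((2, 1), 20.0))
loss.backward()

# ReLU's backward multiplies the incoming gradient by 0 wherever its
# input was <= 0: where pre <= 0, pre.grad is zero even though out.grad
# is not -- the same pattern as the zeros in the printouts above
```

Seeing a non-zero gradient at the ReLU output but all zeros right before it is the classic signature of the ReLU blocking the backward pass for those samples.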
What does this mean?
Also, I tried another approach and replaced my half_net with a simple 3-layer MLP.
I fed it a single sample; learning stopped after about 300 epochs without converging to the correct answer.
The loss stabilized at a fixed number and refused to move past that point.
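The single-sample test can be sketched roughly like this (the hidden sizes, learning rate, and target value are assumptions, not the original code):

```python
import torch
import torch.nn as nn

# Rough sketch of overfitting a 3-layer MLP on one sample; sizes, lr,
# and target are assumptions made for illustration.
torch.manual_seed(0)
mlp = nn.Sequential(
    nn.Linear(4096, 64), nn.ReLU(),
    nn.Linear(64, 64), nn.ReLU(),
    nn.Linear(64, 1),            # no ReLU on the output head
)
opt = torch.optim.SGD(mlp.parameters(), lr=1e-3)
loss_fn = nn.MSELoss()

x = torch.randn(1, 4096)
y = torch.tensor([[10.0]])

loss_before = loss_fn(mlp(x), y).item()
for _ in range(300):
    opt.zero_grad()
    loss = loss_fn(mlp(x), y)
    loss.backward()
    opt.step()
loss_after = loss_fn(mlp(x), y).item()

# a healthy network should drive the loss on one memorized sample toward 0;
# if it plateaus at a fixed value instead, hooking the intermediate
# gradients is one way to check whether a ReLU is zeroing them out
```

One common cause of exactly this kind of plateau is ending the network with a ReLU: its output is clamped to >= 0, and once the pre-activation goes negative the gradient through it is zero, so training stalls at whatever loss that leaves.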
How is this all related?