snsmssss
(Sambath Narayanan Parthasarathy)
August 1, 2020, 10:53am
1
Background
-I took working Python code for a sample neural net (gradients derived analytically, from Victor Zhou's blog)
-I converted it to PyTorch with autograd
-My aim is to explain NNs through PyTorch & GPU/CUDA to beginners
-I have hard coded weights to make the compute deterministic
-CPU version
-Python VERSION: 3.8.3
-PyTorch VERSION: 1.5.1
Problem
-Though the pure Python code converges, the PyTorch code terminates with an error and NaN
-Am I making any basic mistakes? (I am new to PyTorch)
-Since the pure Python version works fine, why does the torch code explode with NaN after running for 80 epochs?
PyTorch code given below:
OUTPUT
Kushaj
(Kushajveer Singh)
August 1, 2020, 10:20pm
2
Maybe normalizing the inputs can help.
snsmssss
(Sambath Narayanan Parthasarathy)
August 2, 2020, 2:46am
3
I did that as well, but the problem still persists.
Kushaj
(Kushajveer Singh)
August 2, 2020, 2:07pm
4
data = (data - data.mean())/data.std()
Loss is converging
Epoch 0 loss: 0.326
Epoch 10 loss: 0.309
Epoch 20 loss: 0.263
Epoch 30 loss: 0.200
Epoch 40 loss: 0.166
Epoch 50 loss: 0.161
Epoch 60 loss: 0.145
Epoch 70 loss: 0.102
Epoch 80 loss: 0.051
Epoch 90 loss: 0.020
Epoch 100 loss: 0.010
Epoch 110 loss: 0.009
Epoch 120 loss: 0.009
Epoch 130 loss: 0.009
Epoch 140 loss: 0.009
Epoch 150 loss: 0.007
Epoch 160 loss: 0.006
Epoch 170 loss: 0.004
Epoch 180 loss: 0.003
Epoch 190 loss: 0.002
Epoch 200 loss: 0.001
Epoch 210 loss: 0.001
Epoch 220 loss: 0.001
Epoch 230 loss: 0.000
Epoch 240 loss: 0.000
Epoch 250 loss: 0.000
Epoch 260 loss: 0.000
Epoch 270 loss: 0.000
Epoch 280 loss: 0.000
Epoch 290 loss: 0.000
Epoch 300 loss: 0.000
Epoch 310 loss: 0.000
Epoch 320 loss: 0.000
Epoch 330 loss: 0.000
Epoch 340 loss: 0.000
Epoch 350 loss: 0.000
snsmssss
(Sambath Narayanan Parthasarathy)
August 4, 2020, 11:57am
5
Kushaj: Thank you! What did you find and change?
I found one issue with the way I had defined the targets in my code:
all_y_trues = torch.tensor([
1.,
0.,
0.,
1.])
I changed it to:
all_y_trues = torch.tensor([
[1.],
[0.],
[0.],
[1.]])
It no longer explodes midway through the epochs.
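For anyone hitting the same thing, here is a minimal sketch of why the target shape can matter. It assumes the network output has shape (4, 1), which is typical but not confirmed by the code in this thread; y_pred and its values are made up for illustration. Subtracting a (4,) target from a (4, 1) prediction broadcasts to a (4, 4) matrix, so the loss compares every prediction with every target rather than pairing them up.

```python
import torch

# Hypothetical predictions with shape (4, 1); the values are illustrative only.
y_pred = torch.tensor([[0.9], [0.2], [0.1], [0.8]])

# The two target layouts from the post above.
y_flat = torch.tensor([1., 0., 0., 1.])           # shape (4,)
y_col = torch.tensor([[1.], [0.], [0.], [1.]])    # shape (4, 1)

# (4,) minus (4, 1) broadcasts to (4, 4): every target against every prediction.
diff_flat = y_flat - y_pred
# (4, 1) minus (4, 1) stays (4, 1): one target per prediction.
diff_col = y_col - y_pred

# Mean squared error computed from each difference tensor.
loss_flat = (diff_flat ** 2).mean()
loss_col = (diff_col ** 2).mean()

print(diff_flat.shape, diff_col.shape)
print(loss_flat.item(), loss_col.item())
```

The two losses differ, so the gradients do too. Checking y_pred.shape against the target's shape before computing the loss is a cheap way to catch this.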
Kushaj
(Kushajveer Singh)
August 5, 2020, 11:18am
6
I only normalized the input data by adding this line: data = (data - data.mean())/data.std()
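A minimal sketch of that normalization step in isolation. The data tensor here is a hypothetical stand-in, since the original inputs are not shown in this thread. Note that data.mean() and data.std() with no arguments reduce over the whole tensor, so this applies one global mean and standard deviation rather than per-feature statistics.

```python
import torch

# Hypothetical input data (4 samples, 2 features); not the thread's actual values.
data = torch.tensor([[-2., -1.],
                     [25., 6.],
                     [17., 4.],
                     [-15., -6.]])

# Global standardization: subtract the overall mean, divide by the overall std.
normalized = (data - data.mean()) / data.std()

# After this step the tensor has roughly zero mean and unit standard deviation.
print(normalized.mean().item(), normalized.std().item())
```

If the features were on very different scales, per-column statistics (data.mean(dim=0) and data.std(dim=0)) would be an alternative worth trying, though that is not what was used here.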