NaN in Output and Loss

Hello,

I am a newbie in PyTorch and AI, and I am doing this as a private hobby project.

My code is supposed to take X numbers (floats) from a list and give me back the (X+1)-th number (float), but all I get back is:

for the output tensor:

tensor([nan, nan, nan, nan, nan, nan, nan, nan, nan, nan],
       device='cuda:0', grad_fn=<ThAddBackward>)

and for the loss:

tensor(nan, device='cuda:0', grad_fn=<MseLossBackward>)

I don’t know what this means :confused:

Here is my code, thank you for your help:

import torch
import torch.nn as nn
import torch.nn.functional as F
from torch.autograd import Variable
import torch.optim as optim
import os

DatasetNumber = 1
DataAmount = 2
matrix1 = torch.Tensor(DataAmount * 5)
matrix2 = torch.Tensor(DataAmount * 5)

class TestNetz(nn.Module):

    # Create the network

    def __init__(self):

        super(TestNetz, self).__init__()
        self.lin1 = nn.Linear(DataAmount * 5, DataAmount * 5)  # layers (hidden layers): functions that are learned to map from the input to the output
        self.lin2 = nn.Linear(DataAmount * 5, DataAmount * 5)

    def forward(self, x):
        x = F.log_softmax(self.lin1(x), 0)  # activation function (log_softmax over dim 0)
        x = self.lin2(x)
        return x

    def num_flat_features(self, x):
        size = x.size()[1:]
        num = 1
        for i in size:
            num *= i
        return num  # return after multiplying all dimensions

    # Prepare the data

    k = open('Datenset.txt', 'r')
    lines = k.readlines()

    i = 0
    while i < DataAmount:
        j = 0
        while j < 5:
            matrix1[j + (5 * i)] = float(lines[j + (5 * i) + (DatasetNumber * 5)])
            matrix2[j + (5 * i)] = float(lines[(DatasetNumber * DataAmount) + (DataAmount * 5)])
            j = j + 1
        i = i + 1



    print(matrix1)
    print(matrix2)


netz = TestNetz()
netz = netz.cuda()
print(netz)

if os.path.isfile('TestNetz.pt'):
    netz = torch.load('TestNetz.pt')


for i in range(100):
    # Input
    input = Variable(matrix1)
    input = input.cuda()

    out = netz(input)

    print(out)

    # Target
    target = Variable(matrix2)
    target = target.cuda()
    criterion = nn.MSELoss()  # loss computation
    loss = criterion(out, target)
    #print(loss)

    netz.zero_grad()
    loss.backward()
    optimizer = optim.SGD(netz.parameters(), lr=0.01)  # optimizer (SGD) with learning rate
    optimizer.step()


torch.save(netz, 'TestNetz.pt')

Could you check your input for NaN values? NaN is the only value that is not equal to itself, so the comparison below prints a 0 entry if any NaN is present.
Just use

print((matrix1==matrix1).all())
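
Alternatively, if your PyTorch version already provides it, torch.isnan gives a more direct check (a small sketch, assuming torch is imported and matrix1 is your input tensor):

# non-zero output means at least one element is NaN
print(torch.isnan(matrix1).any())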

There are a few minor issues in your code:

  • torch.Tensor creates an uninitialized tensor. I would recommend using e.g. torch.zeros instead, so that the values are zero in case you don’t initialize them yourself.
  • It’s uncommon to add F.log_softmax between layers. Usually you would only use it after the last linear layer, and only for a classification use case. F.relu would be a common non-linearity between layers.
  • Variables are deprecated since 0.4.0. You can use tensors directly now.
  • Probably not an issue, but you are not closing the file k. A good way is to use with open('Datenset.txt', 'r') as k: so that the file is closed automatically.
  • I would suggest creating the criterion and optimizer outside the for loop (see the sketch below). It’s not that important for the criterion, but an optimizer with running estimates (e.g. Adam) would lose those estimates if you re-created it in each iteration.
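
Putting these points together, here is a minimal sketch of how the model and training loop could look (just a sketch, assuming DataAmount, matrix1 and matrix2 are prepared as in your code; it is not by itself a fix for the NaNs):

import torch
import torch.nn as nn
import torch.nn.functional as F
import torch.optim as optim

class TestNetz(nn.Module):
    def __init__(self):
        super(TestNetz, self).__init__()
        self.lin1 = nn.Linear(DataAmount * 5, DataAmount * 5)
        self.lin2 = nn.Linear(DataAmount * 5, DataAmount * 5)

    def forward(self, x):
        x = F.relu(self.lin1(x))  # ReLU as the non-linearity between layers
        x = self.lin2(x)          # no activation on the output for regression
        return x

netz = TestNetz().cuda()

# create the criterion and optimizer once, outside the training loop
criterion = nn.MSELoss()
optimizer = optim.SGD(netz.parameters(), lr=0.01)

# plain tensors, no Variable needed since 0.4.0
input = matrix1.cuda()
target = matrix2.cuda()

for i in range(100):
    optimizer.zero_grad()  # clear the gradients from the previous step
    out = netz(input)
    loss = criterion(out, target)
    loss.backward()
    optimizer.step()

Here optimizer.zero_grad() takes over the role of netz.zero_grad() in your code; both clear the parameter gradients before the next backward pass.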

Thank you for your answer,

the output of this line:

print((matrix1==matrix1).all())

was:

tensor(1, dtype=torch.uint8)

so there don’t seem to be any NaNs in the input. All my data are floats; here are the first 100 numbers from my list “Datenset.txt”:

1.19532
1.194
1.19525
1.194
1620
1.19447
1.19387
1.19401
1.19417
1382
1.19461
1.19358
1.19415
1.19378
1508
1.19408
1.19377
1.19377
1.19391
1340
1.19466
1.19386
1.19392
1.19438
1318
1.19488
1.19362
1.19437
1.19417
2254
1.19478
1.19371
1.19417
1.19474
1748
1.19474
1.19414
1.19474
1.19422
1140
1.19454
1.19409
1.19421
1.1944
953
1.19491
1.19435
1.1944
1.19486
963
1.19549
1.19482
1.19486
1.19545
1164
1.19583
1.19458
1.19544
1.19521
2341
1.19687
1.19518
1.1952
1.19669
3691
1.19874
1.1967
1.19671
1.1981
3277
1.19848
1.19732
1.19813
1.19802
2924
1.19891
1.19788
1.19801
1.19869
1970
1.19949
1.19843
1.1987
1.19947
2850
1.19983
1.19822
1.19946
1.19866
2410
1.19947
1.19853
1.19865
1.19926
2262
1.20111
1.19875
1.19924
1.20095
4166

I’ve tried to apply all of your proposals to my code, but the problem is the same:

import torch
import torch.nn as nn
import torch.nn.functional as F
from torch.autograd import Variable
import torch.optim as optim
import os

DatasetNumber = 1
DataAmount = 2
matrix1 = torch.zeros(DataAmount * 5)
matrix2 = torch.zeros(DataAmount * 5)

class TestNetz(nn.Module):

    # Create the network

    def __init__(self):

        super(TestNetz, self).__init__()
        self.lin1 = nn.Linear(DataAmount * 5, DataAmount * 5)  # layers (hidden layers): functions that are learned to map from the input to the output
        self.lin2 = nn.Linear(DataAmount * 5, DataAmount * 5)

    def forward(self, x):
        x = F.relu(self.lin1(x))  # activation function ReLU
        x = self.lin2(x)
        return x

    def num_flat_features(self, x):
        size = x.size()[1:]
        num = 1
        for i in size:
            num *= i
        return num  # return after multiplying all dimensions

    # Prepare the data

    k = open('Datenset.txt', 'r')
    lines = k.readlines()

    i = 0
    while i < DataAmount:
        j = 0
        while j < 5:
            matrix1[j + (5 * i)] = float(lines[j + (5 * i) + (DatasetNumber * 5)])
            matrix2[j + (5 * i)] = float(lines[(DatasetNumber * DataAmount) + (DataAmount * 5)])
            j = j + 1
        i = i + 1

    k.close()

    #print(matrix1)
    #print(matrix2)


netz = TestNetz()
netz = netz.cuda()
#print(netz)

if os.path.isfile('TestNetz.pt'):
    netz = torch.load('TestNetz.pt')


for i in range(100):
    # Input
    input = matrix1
    input = input.cuda()

    out = netz(input)

    #print(out)

    # Target
    target = matrix2
    target = target.cuda()

torch.save(netz, 'TestNetz.pt')

criterion = nn.MSELoss()  # loss computation
loss = criterion(out, target)
# print(loss)

netz.zero_grad()
loss.backward()
optimizer = optim.SGD(netz.parameters(), lr=0.01)  # optimizer (SGD) with learning rate
optimizer.step()

print((matrix1==matrix1).all())
print(matrix1)
print(matrix2)

edit: I’ve checked all entries with print(type(entry)), and every entry is “class float”.

Did you find a solution?

Hello, I ran into the same problem. When I normalized my inputs to the interval [-1, 1], the hidden and output values no longer contained any NaN values. I hope this gives you another option to try.
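
For anyone who wants to try this: here is a minimal sketch of min-max scaling a tensor to [-1, 1] (scale_to_range is a made-up helper name; matrix1 and matrix2 are the tensors from the code above):

import torch

def scale_to_range(x):
    # min-max scale x to [-1, 1]; assumes x is not constant
    x_min, x_max = x.min(), x.max()
    return 2 * (x - x_min) / (x_max - x_min) - 1

matrix1 = scale_to_range(matrix1)
matrix2 = scale_to_range(matrix2)

Since every fifth value in the posted data (e.g. 1620, 1382) is on a very different scale than the ~1.19 values, scaling each of the five features separately would probably work even better.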