I get nan/inf as an output

I want the model to be able to give me the weight of any mass (F = ma = mass * 9.8 = weight).
I know it's a silly idea, but I'm trying to start with the simple things.

import torch
import numpy as np

def make_data(n):
  x, y = [], []
  for i in range(n):
    mass = np.random.randint(1, 1000, 1)[0]
    x.append([mass])
    y.append([mass * 9.8])
  return x, y

data = make_data(2000)
x = torch.autograd.Variable(torch.Tensor(data[0]))
y = torch.autograd.Variable(torch.Tensor(data[1]))
lr = 0.1

class ML(torch.nn.Module):
  def __init__(self):
    super(ML, self).__init__()
    self.layer = torch.nn.Linear(1, 1)
  def forward(self, inp):
    out = self.layer(inp)
    return out

model = ML()
entropyLoss = torch.nn.MSELoss(size_average=False)
gradientDecent = torch.optim.SGD(model.parameters(), lr=lr)

for epoch in range(200):
  pred  = model(x)
  error = entropyLoss(pred, y)
  gradientDecent.zero_grad()
  error.backward()
  gradientDecent.step()
  print(f'Epoch {epoch} : {error.item()}')

Then I get this:

Epoch 0 : 68154060800.0
Epoch 1 : 1.2420331504344075e+27
Epoch 2 : inf
Epoch 3 : inf
Epoch 4 : inf
Epoch 5 : inf
Epoch 6 : nan
Epoch 7 : nan
....
Epoch 199 : nan

I also tried to test it:

>>> model(torch.autograd.Variable(torch.Tensor([[4.0]])))
tensor([[nan]], grad_fn=<AddmmBackward0>)

Your learning rate is too high for the calculated loss, which also sums the sample losses (size_average=False).
I.e. in the first iteration you already have a loss of ~1e+10, which creates gradients with a large magnitude and then updates the parameters with a learning rate of 0.1.
Decrease the learning rate to e.g. 1e-8 and remove the size_average=False argument.
This allows the model to train in ~2000 epochs.
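For reference, a minimal sketch of that fix, reusing the model and data from your script; the 1e-8 learning rate and the 2000-epoch count are just the rough values suggested above, not tuned:

model = ML()
criterion = torch.nn.MSELoss()                             # default mean reduction
optimizer = torch.optim.SGD(model.parameters(), lr=1e-8)   # much smaller step size

for epoch in range(2000):
  pred = model(x)                  # x, y built by make_data above
  loss = criterion(pred, y)
  optimizer.zero_grad()
  loss.backward()
  optimizer.step()
  print(f'Epoch {epoch} : {loss.item()}')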

Alternatively, normalize the inputs and output and de-normalize them during the model inference phase.
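A rough sketch of that alternative; the scale factors below are assumptions based on the 1-1000 mass range used in make_data:

x_scale, y_scale = 1000.0, 9800.0    # assumed max mass and max weight
x_norm = x / x_scale                 # inputs roughly in [0, 1]
y_norm = y / y_scale                 # targets roughly in [0, 1]

model = ML()
criterion = torch.nn.MSELoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)  # 0.1 works on normalized data

for epoch in range(200):
  pred = model(x_norm)
  loss = criterion(pred, y_norm)
  optimizer.zero_grad()
  loss.backward()
  optimizer.step()

# At inference time, scale the input down and the prediction back up:
mass = torch.Tensor([[4.0]])
weight = model(mass / x_scale) * y_scale   # should approach 4 * 9.8 = 39.2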

Thanks, it works … but I did it with 9999 epochs. :fire: