Neural Network doesn't learn anything in regression problem

I am experimenting with a simple regression problem using a neural network. Here is the function that generates the data:

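import math
import random

def generate_data():
    # Endless stream of (x, y, z, gt) samples; inputs are drawn from [0, 100].
    while True:
        x, y, z = [random.uniform(0, 100) for _ in range(3)]
        gt = (x**3 + math.log(y)) * math.cos(z)
        yield x, y, z, gt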

I generated 36,000 data points for this problem and built the following network to fit them:

import torch.nn as nn

class Net(nn.Module):
    def __init__(self):
        super().__init__()
        # Three hidden layers (500 -> 300 -> 100 units) with ReLU, then a scalar output.
        self.l1 = nn.Sequential(nn.Linear(3, 500), nn.ReLU())
        self.l2 = nn.Sequential(nn.Linear(500, 300), nn.ReLU())
        self.l3 = nn.Sequential(nn.Linear(300, 100), nn.ReLU())
        self.l4 = nn.Linear(100, 1)

    def forward(self, x):
        x = self.l1(x)
        return self.l4(self.l3(self.l2(x)))

The training setup is as follows:

import torch.optim as optim

net = Net()
loss_fn = nn.MSELoss()
optimizer = optim.SGD(net.parameters(), lr=0.0001)
### some code for dataset and dataloader here

c = 0  # batch counter
for _ in range(3000):
    loss_avg = 0.0
    for x, gt in dataloader:
        pred = net(x)
        loss = loss_fn(pred, gt)
        loss_avg += loss.item() / 1000.0  # .item() avoids keeping each batch's graph alive
        c += 1
        loss.backward()
        optimizer.step()
    print(f"Loss : {loss_avg}")

However, my model doesn’t learn anything. The loss doesn’t decrease; it just bounces up and down. What I have already tried:

  1. Changing the learning rate => doesn’t work.
  2. Adding norm layers after the ReLU => loss quickly grows to inf.
  3. Dividing the input by its max value => doesn’t work either.
  4. Normalizing both input and output => loss goes up and down but doesn’t decrease.
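
For reference, item 4 could look roughly like the sketch below. It assumes z-score standardization (zero mean, unit variance per column), which may not be exactly what was tried; `standardize` is a hypothetical helper, and the sample count of 1000 and the seed are arbitrary choices for illustration.

```python
import math
import random
import statistics

random.seed(0)  # arbitrary seed, only to make the sketch repeatable

def generate_data():
    # Same generator as in the post above.
    while True:
        x, y, z = [random.uniform(0, 100) for _ in range(3)]
        gt = (x**3 + math.log(y)) * math.cos(z)
        yield x, y, z, gt

def standardize(values):
    """Hypothetical helper: scale values to zero mean and unit variance."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    return [(v - mean) / std for v in values], mean, std

gen = generate_data()
samples = [next(gen) for _ in range(1000)]

# Transpose the samples into four columns: x, y, z, gt.
x_col, y_col, z_col, gt_col = zip(*samples)

xs, _, _ = standardize(x_col)
ys, _, _ = standardize(y_col)
zs, _, _ = standardize(z_col)
gts, gt_mean, gt_std = standardize(gt_col)

# A prediction made in the scaled space maps back to the original scale as:
# gt_original = gt_scaled * gt_std + gt_mean
```

Keeping `gt_mean` and `gt_std` around matters because the network then predicts in the scaled space, and the loss values are no longer comparable to those computed on the raw targets.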

Could you help me pin down what I did wrong? Any debugging suggestions or hints would be appreciated.

I think the problem is that you are accumulating gradients. PyTorch adds each new gradient to the one already stored, so you need to reset them to zero for every update. Try adding optimizer.zero_grad() just before loss.backward().
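
A minimal sketch of what that looks like, assuming a tiny stand-in model and random data rather than the network and dataloader from the question (the tensor shapes, seed, and learning rate here are illustrative only). The first part shows that a second `backward()` call without a reset doubles the stored gradient; the loop shows the suggested ordering.

```python
import torch
import torch.nn as nn
import torch.optim as optim

torch.manual_seed(0)  # arbitrary seed for repeatability

# Hypothetical stand-in data: 8 samples, 3 features, 1 target each.
x = torch.randn(8, 3)
gt = torch.randn(8, 1)

net = nn.Linear(3, 1)
loss_fn = nn.MSELoss()
optimizer = optim.SGD(net.parameters(), lr=0.01)

# Without zero_grad(): a second backward() adds to the stored gradient.
loss_fn(net(x), gt).backward()
grad_once = net.weight.grad.clone()
loss_fn(net(x), gt).backward()
grad_twice = net.weight.grad.clone()  # twice grad_once, since the weights did not change

# With zero_grad(): each step uses only the current batch's gradient.
losses = []
for _ in range(100):
    optimizer.zero_grad()   # clear gradients left over from the previous step
    loss = loss_fn(net(x), gt)
    loss.backward()
    optimizer.step()
    losses.append(loss.item())
```

Calling `optimizer.zero_grad()` at the top of the loop body or right before `loss.backward()` is equivalent here; what matters is that it runs once per update.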

Thanks @Link88. You indeed pointed out something I had missed in the pipeline. I added it to my code, but the problem remains as before. I am still looking for a robust solution (and trying different approaches), so if you spot anything else, please let me know.