My NN's forward pass gives the same output during validation

Hi everyone,
I am learning neural networks and PyTorch, and I have run into a problem I don't understand. I searched this forum without success. The problem is that I get the same output every time when I validate the network, but when I do the forward pass during training, the network's output varies as it should.

Here is my code:

import torch
import torch.nn as nn
import torch.optim as optim
from torch.autograd import Variable


class NN3Layer(nn.Module):
    def __init__(self, input_dimension, hidden1_dimension, hidden2_dimension, output_dimension, learning_rate=0.01):
        super().__init__()

        self.input_dimension = input_dimension
        self.hidden1_dimension = hidden1_dimension
        self.hidden2_dimension = hidden2_dimension
        self.output_dimension = output_dimension

        # three linear layers with ReLU activations in between
        self.model = nn.Sequential(
            nn.Linear(self.input_dimension, self.hidden1_dimension),
            nn.ReLU(),
            nn.Linear(self.hidden1_dimension, self.hidden2_dimension),
            nn.ReLU(),
            nn.Linear(self.hidden2_dimension, self.output_dimension),
        )

        self.loss_function = nn.MSELoss()
        self.learning_rate = learning_rate
        self.optimizer = optim.SGD(self.model.parameters(), lr=self.learning_rate)

    def forward(self, input):
        x = Variable(torch.FloatTensor(input))
        output = self.model(x)
        return output

    def train_model(self, input, target):
        input_variable = Variable(torch.FloatTensor(input))
        target_variable = Variable(torch.FloatTensor(target), requires_grad=False)

        # one training step on a single sample
        output = self.model(input_variable)
        print('training output is ', output)

        loss = self.loss_function(output, target_variable)
        self.optimizer.zero_grad()
        loss.backward()
        self.optimizer.step()


myNN = NN3Layer(5, 3, 2, 1)

# train a model
for item in train_data:
    input_list = item[:-1]
    print('training input list are ', input_list)
    target = [item[-1]]
    myNN.train_model(input_list, target)

# validate a model
for item in validate_data:
    validate_list = item[:-1]
    validate_target = [item[-1]]
    print('validating input list are ', validate_list)

    result = myNN.forward(validate_list)
    print('validate output is ', result)
print(list(myNN.parameters())) 

This is what I get:

training input list are  [0.86, 0.265, 0.747, 2.7111, 0.0052]
training output is  Variable containing:
16809.6094
[torch.FloatTensor of size 1]

training input list are  [0.86, 0.265, 0.748, 2.7133, 0.0052]
training output is  Variable containing:
16988.2969
[torch.FloatTensor of size 1]

validating input list are  [0.86, 0.265, 0.7490000000000001, 2.7177, 0.0052]
validate output is   Variable containing:
16966.4102
[torch.FloatTensor of size 1]

validating input list are  [0.86, 0.265, 0.753, 2.6585, 0.0051]
validate output is   Variable containing:
16966.4102
[torch.FloatTensor of size 1]

validating input list are  [0.78, 0.26899999999999996, 0.818, 2.5598, 0.0049]
validate output is  Variable containing:
16966.4102
[torch.FloatTensor of size 1]

[Parameter containing:
-7.1699e+05         nan         nan -4.3887e+05 -8.4361e+02
-1.0204e+06         nan         nan -6.2457e+05 -1.2003e+03
-2.1493e+01         nan         nan -5.9462e+01  1.4713e-01
[torch.FloatTensor of size 3x5]
, Parameter containing:
1.00000e+05 *
-4.6862
-6.6691
-0.0002
[torch.FloatTensor of size 3]
, Parameter containing:
-1.8504e+06 -1.0960e+05  3.7260e-01
-4.1361e-01 -9.5251e-02  9.6318e+01
[torch.FloatTensor of size 2x3]
, Parameter containing:
-54996.9258
-2619.3457
[torch.FloatTensor of size 2]
, Parameter containing:
1.00000e+05 *
-1.4817 -0.1068
[torch.FloatTensor of size 1x2]
, Parameter containing:
16966.4102
[torch.FloatTensor of size 1]
]

The training outputs are 16809.6094 and 16988.2969, but the validation output stays the same at 16966.4102.
Why does the training output vary while the validation output stays the same, even though both are computed with the same model and weights? Is something wrong in my code? Any advice or guidance would be greatly appreciated.

I guess you have run into a gradient problem: the output 16966.4102 is exactly the value of the bias of the last linear layer, which suggests that wx = 0, because output = wx + b.
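
A quick way to confirm this (my own check, not from your post; assumes a reasonably recent PyTorch):

# look for NaN parameters and print the bias of the last linear layer;
# if the hidden activations going into that layer are all zero,
# the network output is exactly that bias
for name, p in myNN.named_parameters():
    if (p != p).any():   # NaN is the only value not equal to itself
        print(name, 'contains NaN')
print('last-layer bias:', list(myNN.parameters())[-1])   # 16966.4102 in your run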

Two suggestions (a rough sketch follows this list):

  • use Adam instead of SGD
  • add BatchNorm before the ReLU
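
For example, the modified model could look roughly like this (my illustration only, mirroring NN3Layer(5, 3, 2, 1); note that nn.BatchNorm1d expects batched input of shape (N, features), so the training loop would also have to pass mini-batches instead of single samples):

import torch.nn as nn
import torch.optim as optim

# sketch: BatchNorm1d before each ReLU, Adam instead of SGD
model = nn.Sequential(
    nn.Linear(5, 3),
    nn.BatchNorm1d(3),
    nn.ReLU(),
    nn.Linear(3, 2),
    nn.BatchNorm1d(2),
    nn.ReLU(),
    nn.Linear(2, 1),
)
optimizer = optim.Adam(model.parameters(), lr=0.01)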

Thank you very much for your advice, I will try it this weekend. But I think that if it were a gradient problem, the network should show it during training as well, and it doesn't.