Hi, I am just starting out with PyTorch. I have run into a very strange result in my program, and I don’t know whether it is a bug or the expected behaviour. My simple test code looks like this:
import torch
import torch.nn as nn
from torch import optim
import torch.nn.functional as F
import numpy as np

def init_weights(m):
    if type(m) == nn.Linear:
        torch.nn.init.xavier_uniform_(m.weight) #works
        # torch.nn.init.normal_(m.weight,mean=1,std=1) #doesn't work
        # torch.nn.init.uniform_(m.weight) #doesn't work

class Model(nn.Module):
    def __init__(self):
        super(Model,self).__init__()
        self.fc1 = nn.Linear(120,60)
        self.fc2 = nn.Linear(60,40)
        self.fc3 = nn.Linear(40,1)

    def forward(self,x):
        x = F.relu(self.fc1(x))
        x = F.relu(self.fc2(x))
        x = F.relu(self.fc3(x))
        return x

model = Model()
model.apply(init_weights) #Disabling this line leads to expected results
criterion = nn.MSELoss()
optimizer = optim.SGD(model.parameters(),lr=0.001, momentum=0.9)
model = model.cuda()
model.train()
printFreq = 1

for epochNo in range(20):
    optimizer.zero_grad()
    targetV = torch.rand(8,1).cuda()+10
    inputV = torch.rand(8,120).cuda()
    output = model(inputV)
    loss = criterion(output,targetV)
    loss.backward()
    optimizer.step()
    if epochNo % printFreq == 0:
        print(output)
It’s just a toy example with random input. In theory, the network should learn to disregard the input and output the mean of the target, which is about 10.5 (the targets are uniform in [10, 11)). With Xavier uniform initialization it works properly, but with uniform or normal initialization the network output is always zero after the first optimizer step. Is this normal behaviour? Thanks!
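One way to narrow this down is to record both the output and the gradients around the first update. The sketch below is a CPU-only variant of the test above using `uniform_` initialization; the names `init_uniform`, `grad_is_zero` and `history` are illustrative helpers, not part of the original code. It rests on the assumption that the final `F.relu` gets pushed into its zero region by a very large first update, in which case every gradient would be exactly zero from the second step onwards and the network could no longer recover.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F
from torch import optim

torch.manual_seed(0)  # fixed seed so repeated runs are comparable


class Model(nn.Module):
    def __init__(self):
        super().__init__()
        self.fc1 = nn.Linear(120, 60)
        self.fc2 = nn.Linear(60, 40)
        self.fc3 = nn.Linear(40, 1)

    def forward(self, x):
        x = F.relu(self.fc1(x))
        x = F.relu(self.fc2(x))
        return F.relu(self.fc3(x))  # same final ReLU as in the test code


def init_uniform(m):
    # Illustrative helper: the "doesn't work" initialization, weights in [0, 1)
    if isinstance(m, nn.Linear):
        torch.nn.init.uniform_(m.weight)


def grad_is_zero(model):
    # True only if every parameter has a gradient and all entries are zero
    return all(
        p.grad is not None and torch.count_nonzero(p.grad).item() == 0
        for p in model.parameters()
    )


model = Model()
model.apply(init_uniform)
criterion = nn.MSELoss()
optimizer = optim.SGD(model.parameters(), lr=0.001, momentum=0.9)

history = []  # (largest |output|, all gradients zero?) per step
for step in range(3):
    optimizer.zero_grad()
    target = torch.rand(8, 1) + 10
    inputs = torch.rand(8, 120)
    output = model(inputs)
    loss = criterion(output, target)
    loss.backward()
    history.append((output.detach().abs().max().item(), grad_is_zero(model)))
    optimizer.step()
    print(step, history[-1])
```

If the second entry of each tuple turns `True` after the first step, the zero output is a consequence of the ReLU units no longer passing any gradient rather than of the optimizer itself. Printing the first output magnitude also shows how far the initial prediction is from the target range of [10, 11).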