# Sometimes getting all-zeros output when training a small network on XOR

I am training a small network on the XOR task. However, it sometimes outputs an all-zeros tensor on the training data.

```
import torch

train = torch.tensor([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=torch.float32)
label = torch.tensor([0, 1, 1, 0], dtype=torch.float32).reshape(4, 1)
loss_fn = torch.nn.MSELoss()
lr = 0.005
net = torch.nn.Sequential(
    torch.nn.Linear(2, 2),
    torch.nn.ReLU(),
    torch.nn.Linear(2, 1),
    torch.nn.ReLU()
)

for i in range(1000):
    output = net(train)
    loss = loss_fn(output, label)
    loss.backward()
    # print(loss.item())

    # manual SGD update, then reset grads for the next iteration
    for p in net.parameters():
        p.data -= lr * p.grad
        p.grad.zero_()

print(net(train))
```

So is my code wrong? Or is it just because of the randomly initialized parameter values?

Hi,

Are you sure you need the last `ReLU` in your net? If the pre-activation of that final layer happens to be negative for all four inputs, the `ReLU` outputs zero everywhere and passes back a zero gradient, so the net can never escape (the classic "dying ReLU" problem).
Also, I vaguely remember reading that the point where all weights are 0 is problematic for XOR with very small nets.
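
For example, here is a minimal sketch of the same setup with the final `ReLU` removed and a built-in optimizer (the learning rate and iteration count here are guesses, not tuned values):

```
import torch

train = torch.tensor([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=torch.float32)
label = torch.tensor([0, 1, 1, 0], dtype=torch.float32).reshape(4, 1)

# Same net, but without the trailing ReLU, so the output (and its gradient)
# is not clamped to zero when the last pre-activation is negative.
net = torch.nn.Sequential(
    torch.nn.Linear(2, 2),
    torch.nn.ReLU(),
    torch.nn.Linear(2, 1),
)
loss_fn = torch.nn.MSELoss()
opt = torch.optim.SGD(net.parameters(), lr=0.05)

for i in range(5000):
    opt.zero_grad()
    loss = loss_fn(net(train), label)
    loss.backward()
    opt.step()

print(net(train))  # should approach [[0], [1], [1], [0]] for many seeds
```

A 2-unit hidden layer can still get stuck on XOR for some random seeds, but without the final `ReLU` the gradient at least never vanishes identically.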

When a `Linear` layer is constructed, `reset_parameters` is called:

```
def reset_parameters(self):
    stdv = 1. / math.sqrt(self.weight.size(1))
    self.weight.data.uniform_(-stdv, stdv)
    if self.bias is not None:
        self.bias.data.uniform_(-stdv, stdv)
```
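
For the layer shapes above this bound is easy to compute (a quick sanity check assuming the quoted scheme; the exact init code differs across PyTorch versions):

```
import math
import torch

layer = torch.nn.Linear(2, 2)  # same shape as the first layer of the net
stdv = 1.0 / math.sqrt(layer.weight.size(1))
print(stdv)  # 1/sqrt(2) ~= 0.7071
# Under the quoted scheme, every weight and bias is drawn from
# U(-0.7071, 0.7071), so an exactly all-zero init is essentially impossible.
print(layer.weight, layer.bias)
```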

I think the initialized parameters are very unlikely to be all zeros, yet the all-zeros output happens frequently. I've checked the `.grad` of every net parameter on each iteration when the all-zeros output occurs, and the gradients are all zeros as well.
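
One way to see where the zeros come from is to look at the pre-activation feeding the last `ReLU`, by slicing the `Sequential` from the first post (run this during one of the all-zeros episodes):

```
# net[:3] runs everything up to, but not including, the final ReLU
pre_act = net[:3](train)
print(pre_act)
```

If every entry is negative, the final `ReLU` clamps the whole output to zero and also passes back a zero gradient, which would explain both the all-zeros outputs and the all-zeros `.grad` values.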