Forward propagation returns different logits on same samples

Consider the following LeNet model for MNIST:

import torch
from torch import nn
import torch.nn.functional as F

class LeNet(nn.Module):
    def __init__(self):
        super(LeNet, self).__init__()
        self.conv1 = nn.Conv2d(1, 20, 5, 1)
        self.conv2 = nn.Conv2d(20, 50, 5, 1)
        self.fc1 = nn.Linear(4*4*50, 500)
        self.fc2 = nn.Linear(500, 10)
        self.ceriation = nn.CrossEntropyLoss()
    def forward(self, x):
        x = self.conv1(x)
        x = F.max_pool2d(x, 2, 2)
        x = F.relu(x)
        x = self.conv2(x)
        x = F.max_pool2d(x, 2, 2)
        x = F.relu(x)
        x = x.view(-1, 4*4*50)
        x = self.fc1(x)
        x = self.fc2(x)
        return x

Now I use this model to run a single forward pass on a batch of samples, like this:

network = LeNet()
optimizer = torch.optim.SGD(network.parameters(), lr=0.001, momentum=0.9)
device = torch.device("cpu")
# X_batch= ... some batch of 50 samples pulled from a train_loader defined as
# torch.manual_seed(42)
# training_set = datasets.MNIST('./mnist_data', train=True, download=False, 
#                               transform=transforms.Compose([
#                                   transforms.ToTensor(),
#                                   transforms.Normalize((0.1307,), (0.3081,))]))
# train_loader = torch.utils.data.DataLoader(training_set,
#                                            batch_size=50,
#                                            shuffle=False)
logits = network(X_batch)

Note that shuffle=False and download=False, since the data set is already downloaded and I don’t want to shuffle. My problem is that if I run this code twice I get different values for logits, and I don’t understand why, since everything else seems to be unchanged. As an extra check, I also convert X_batch to a numpy array and verify with numpy.array_equal() that the batch of samples is exactly the same as in the previous execution.

I really can’t figure out what I am missing here unless there are precision issues.

Hello Konstantinos!

When you construct your LeNet (two separate times), you
construct its layers, e.g.,
self.fc1 = nn.Linear(4*4*50, 500), (two separate times).

By default, PyTorch randomly initializes the layer weights, so you
get two differing sets of weights, and hence differing output
values for logits, even when the input batch is identical.

You can check this by printing out some example weights from
network each individual time you initialize it.
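
Here is a minimal sketch of that check. It assumes a standalone nn.Linear with the same shape as fc1 stands in for the full LeNet, and the names fc1_a, fc1_b and weights_match are just for illustration:

```python
import torch
from torch import nn

# Two independently constructed layers, mimicking what happens
# when the script (and hence LeNet()) is run two separate times.
fc1_a = nn.Linear(4*4*50, 500)
fc1_b = nn.Linear(4*4*50, 500)

# Print a few example weights from each layer to compare by eye.
print(fc1_a.weight[0, :5])
print(fc1_b.weight[0, :5])

# Compare the full weight tensors element by element.
weights_match = torch.equal(fc1_a.weight, fc1_b.weight)
print("weights identical:", weights_match)
```

With the default random initialization the two printed rows should differ, which is enough to explain differing logits on the same batch.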


K. Frank

You can get identical results across runs if you fix the random seed before constructing the model. More on that here:
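
A rough sketch of fixing the seed, assuming a single nn.Linear layer stands in for the full network and that everything runs on CPU. The helper build_and_run and the dummy all-ones input are hypothetical, just to keep the example short:

```python
import torch
from torch import nn

def build_and_run(seed):
    # Seed BEFORE constructing the layer, so the random
    # weight initialization is the same every time.
    torch.manual_seed(seed)
    layer = nn.Linear(4*4*50, 10)
    x = torch.ones(2, 4*4*50)  # dummy batch standing in for X_batch
    with torch.no_grad():
        return layer(x)

out_a = build_and_run(42)
out_b = build_and_run(42)

# With the same seed, both constructions get the same weights,
# so the outputs on the same input agree.
same = torch.equal(out_a, out_b)
print("outputs identical:", same)
```

The important detail is the order: torch.manual_seed() has to be called before the model is constructed, because that is where the random initialization happens. Seeding only before the data loader is created would not affect weights that were already initialized.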

Thanks a lot for your answer, this helped me narrow down the problem.