Test accuracy with different batch sizes

This is a newbie question, but for some reason the accuracy of my model changes when I change the batch size at test time. Decreasing the batch size reduces the accuracy, until a batch size of 1 gives me 11% accuracy, even though the same model gives 97% accuracy with a test batch size of 512 (I trained it with batch size 512). I am using a pretrained ResNet-50 and fine-tuning it on my own images, and I am calling .train() and .eval() properly at train and test time. The best explanation I can come up with is that the batch normalization layers are somehow still tracking batch statistics at test time (which they are not supposed to do; they should be using the statistics saved during training), because a batch size of 1 leads to mean(x) = x, the output of the BN layer becomes 0, and the prediction collapses to class 0, giving 11% accuracy since 11% of my data is from class 0. Also, when I pass the validation and test loaders unshuffled I get bad numbers, but when I shuffle them the same sets give me 96%+ accuracy. Can somebody help me please? Thank you so much!

One more thing: I only changed the classifier at the end and didn’t use batch norm there. The rest of the model is the standard implementation that comes with the PyTorch models module.

I can’t give an exact solution, but here are a few suggestions:
1. Check the per-class accuracy. For different test batches it shows you which classes the model is not handling well.
https://pytorch.org/tutorials/beginner/blitz/cifar10_tutorial.html

class_correct = list(0. for i in range(10))
class_total = list(0. for i in range(10))
with torch.no_grad():
    for data in testloader:
        images, labels = data
        outputs = net(images)
        _, predicted = torch.max(outputs, 1)
        c = (predicted == labels).squeeze()
        for i in range(labels.size(0)):   # iterate over the actual batch size, not a hard-coded 4
            label = labels[i]
            class_correct[label] += c[i].item()
            class_total[label] += 1

for i in range(10):
    print('Accuracy of %5s : %2d %%' % (
        classes[i], 100 * class_correct[i] / class_total[i]))

2. If you don’t shuffle the training set you can get bad numbers, because the samples are sometimes stored in class order. The model then sees only one class for a while (e.g. samples 1–100) and then another (e.g. 100–200), which can lead to catastrophic forgetting. It’s always better to feed shuffled batches.

Thank you for your time and the answer, but I think I forgot to mention a few details. Firstly, I trained the model with a shuffled training set; it’s only the validation and test sets that give me different results with and without shuffling their loaders. Secondly, my model has more than 97% accuracy on each individual class; it’s only when I change the batch size at test time that the test accuracy keeps changing, and the model performs poorly with a batch size of 1.

Have you set the model to eval mode? [model.eval()]

If so, can you post the model and test code snippets here?

Yes, it is set to .eval(). Let me post the code here, thanks for your time!

class ResNet(nn.Module):
    """
    Get a pretrained VGG network (on ImageNet) and try to finetune it on EuroSat images
    Reported acc is > 98% on Resnet-50, let's see what can we get from a VGG network
    """

    def __init__(self, in_channels):
        super(ResNet, self).__init__()
        graph = models.resnet50(pretrained=True)
        removed = list(graph.children())[:-2]
        with_dropout = []
        with_dropout.append(removed[0])
        with_dropout.append(removed[1])
        with_dropout.append(removed[2])
        with_dropout.append(removed[3])
        for part in removed[4:]:
            with_dropout.append(part)
            with_dropout.append(nn.Dropout2d(p=0.8))
        # print(with_dropout)
        self.feature_extracter = torch.nn.Sequential(*with_dropout)
        self.kill = nn.Dropout(p=0.8)
        self.classifier = nn.Sequential(
            nn.Linear(in_features=2048*4, out_features=1024),
            nn.ReLU(),
            nn.Linear(in_features=1024, out_features=512),
            nn.ReLU(),
            nn.Dropout(p=0.8),
            nn.Linear(in_features=512, out_features=256),
            nn.ReLU(),
            nn.Linear(in_features=256, out_features=128),
            nn.ReLU(),
            nn.Dropout(p=0.8),
            nn.Linear(in_features=128, out_features=10),
            nn.LogSoftmax(dim=0)
        )

    def forward(self, x):
        x = self.feature_extracter(x)
        x = self.kill(x)
        x = self.classifier(x.view(x.size(0), -1))
        return x, torch.argmax(input=x, dim=1)

Here is my training code…

def train_net(model, base_folder, pre_model, save_dir, batch_size, lr, log_after, cuda, device):
    if not pre_model:
        print(model)
    writer = SummaryWriter()
    if cuda:
        print('GPU')
        model.cuda(device=device)
        print('log: training started on device: {}'.format(device))
    # define loss and optimizer
    optimizer = Adam(model.parameters(), lr=lr)
    criterion = nn.CrossEntropyLoss()
    train_loader, val_dataloader, test_loader = get_dataloaders(base_folder=base_folder,
                                                                batch_size=batch_size)
    if not os.path.exists(save_dir):
        os.mkdir(save_dir)

    if True:
        i = 1
        m_loss, m_accuracy = [], []
        if pre_model:
            # self.load_state_dict(torch.load(pre_model)['model'])
            model.load_state_dict(torch.load(pre_model))
            print('log: resumed model {} successfully!'.format(pre_model))
            print(model)

            # starting point
            # model_number = int(pre_model.split('/')[1].split('-')[1].split('.')[0])
            model_number = int(re.findall(r'\d+', str(pre_model))[0])
            i = i + model_number - 1
        else:
            print('log: starting anew using ImageNet weights...')

        while True:
            i += 1
            net_loss = []
            # new model path
            save_path = os.path.join(save_dir, 'model-{}.pt'.format(i))
            # remember to save only five previous models, so
            del_this = os.path.join(save_dir, 'model-{}.pt'.format(i - 6))
            if os.path.exists(del_this):
                os.remove(del_this)
                print('log: removed {}'.format(del_this))

            if i > 1 and not os.path.exists(save_path):
                torch.save(model.state_dict(), save_path)
                print('log: saved {}'.format(save_path))

            correct_count, total_count = 0, 0
            for idx, data in enumerate(train_loader):
                ##########################
                model.train() # train mode at each epoch, just in case...
                ##########################
                test_x, label = data['input'], data['label']
                if cuda:
                    test_x = test_x.cuda(device=device)
                    label = label.cuda(device=device)
                # forward
                out_x, pred = model.forward(test_x)
                # out_x, pred = out_x.cpu(), pred.cpu()
                loss = criterion(out_x, label)
                net_loss.append(loss.item())

                # get accuracy metric
                batch_correct = (label.eq(pred.long())).double().sum().item()
                correct_count += batch_correct
                # print(batch_correct)
                total_count += float(pred.size(0))
                if idx % log_after == 0 and idx > 0:
                    print('{}. ({}/{}) image size = {}, loss = {}: accuracy = {}/{}'.format(i,
                                                                                            idx,
                                                                                            len(train_loader),
                                                                                            out_x.size(),
                                                                                            loss.item(),
                                                                                            batch_correct,
                                                                                            pred.size(0)))
                #################################
                # three steps for backprop
                model.zero_grad()
                loss.backward()
                # perform gradient clipping between loss backward and optimizer step
                clip_grad_norm_(model.parameters(), 0.05)
                optimizer.step()
                #################################
            mean_accuracy = correct_count / total_count * 100
            mean_loss = np.asarray(net_loss).mean()
            m_loss.append((i, mean_loss))
            m_accuracy.append((i, mean_accuracy))

            writer.add_scalar(tag='train loss', scalar_value=mean_loss, global_step=i)
            writer.add_scalar(tag='train over_all accuracy', scalar_value=mean_accuracy, global_step=i)

            print('####################################')
            print('epoch {} -> total loss = {:.5f}, total accuracy = {:.5f}%'.format(i, mean_loss, mean_accuracy))
            print('####################################')

            # validate model after each epoch
            eval_net(model=model, writer=writer, criterion=criterion,
                     val_loader=val_dataloader, denominator=batch_size,
                     cuda=cuda, device=device, global_step=i)
    pass

and finally my test code…

def eval_net(**kwargs):
    model = kwargs['model']
    cuda = kwargs['cuda']
    device = kwargs['device']
    if cuda:
        model.cuda(device=device)
    if 'criterion' in kwargs.keys():
        writer = kwargs['writer']
        val_loader = kwargs['val_loader']
        criterion = kwargs['criterion']
        global_step = kwargs['global_step']
        correct_count, total_count = 0, 0
        net_loss = []
        model.eval()  # put in eval mode first ############################
        for idx, data in enumerate(val_loader):
            test_x, label = data['input'], data['label']
            if cuda:
                test_x = test_x.cuda(device=device)
                label = label.cuda(device=device)
            # forward
            out_x, pred = model.forward(test_x)
            loss = criterion(out_x, label)
            net_loss.append(loss.item())

            # get accuracy metric
            batch_correct = (label.eq(pred.long())).double().sum().item()
            correct_count += batch_correct
            total_count += float(pred.size(0))
        #################################
        mean_accuracy = correct_count / total_count * 100
        mean_loss = np.asarray(net_loss).mean()
        # summarize mean accuracy
        writer.add_scalar(tag='val. loss', scalar_value=mean_loss, global_step=global_step)
        writer.add_scalar(tag='val. over_all accuracy', scalar_value=mean_accuracy, global_step=global_step)
        print('$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$')
        print('log: validation:: total loss = {:.5f}, total accuracy = {:.5f}%'.format(mean_loss, mean_accuracy))
        print('$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$')

    else:
        # model, images, labels, pre_model, save_dir, sum_dir, batch_size, lr, log_after, cuda
        pre_model = kwargs['pre_model']
        base_folder = kwargs['base_folder']
        batch_size = kwargs['batch_size']
        log_after = kwargs['log_after']
        criterion = nn.CrossEntropyLoss()
        un_confusion_meter = tnt.meter.ConfusionMeter(10, normalized=False)
        confusion_meter = tnt.meter.ConfusionMeter(10, normalized=True)
        model.load_state_dict(torch.load(pre_model))
        print('log: resumed model {} successfully!'.format(pre_model))
        _, _, test_loader = get_dataloaders(base_folder=base_folder, batch_size=batch_size)
        net_accuracy, net_loss = [], []
        correct_count = 0
        total_count = 0
        for idx, data in enumerate(test_loader):
            model.eval()  # put in eval mode first
            test_x, label = data['input'], data['label']
            # print(test_x)
            # print(test_x.shape)
            # this = test_x.numpy().squeeze(0).transpose(1,2,0)
            # print(this.shape, np.min(this), np.max(this))
            if cuda:
                test_x = test_x.cuda(device=device)
                label = label.cuda(device=device)
            # forward
            out_x, pred = model.forward(test_x)
            loss = criterion(out_x, label)
            un_confusion_meter.add(predicted=pred, target=label)
            confusion_meter.add(predicted=pred, target=label)

            ###############################
            # pred = pred.view(-1)
            # pred = pred.cpu().numpy()
            # label = label.cpu().numpy()
            # print(pred.shape, label.shape)

            ###############################
            # get accuracy metric
            # correct_count += np.sum((pred == label))
            # print(pred, label)
            batch_correct = (label.eq(pred.long())).double().sum().item()
            correct_count += batch_correct
            # print(batch_correct)
            total_count += float(batch_size)
            net_loss.append(loss.item())
            if idx % log_after == 0:
                print('log: on {}'.format(idx))

            #################################
        mean_loss = np.asarray(net_loss).mean()
        mean_accuracy = correct_count * 100 / total_count
        print(correct_count, total_count)
        print('$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$')
        print('log: test:: total loss = {:.5f}, total accuracy = {:.5f}%'.format(mean_loss, mean_accuracy))
        print('$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$')
        with open('normalized.pkl', 'wb') as this:
            pkl.dump(confusion_meter.value(), this, protocol=pkl.HIGHEST_PROTOCOL)

        with open('un_normalized.pkl', 'wb') as this:
            pkl.dump(un_confusion_meter.value(), this, protocol=pkl.HIGHEST_PROTOCOL)

        pass
    pass

As far as I can see, the code looks correct. Can you please print the intermediate output of some layer (say, a linear layer) for the same input with different batch sizes (in eval mode), to confirm that the model really behaves differently as the batch size varies?
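For example, something along these lines (just a rough sketch, reusing the variables from your eval code) should print two identical rows if the model is truly independent of the batch size:

model.eval()
with torch.no_grad():
    batch = next(iter(val_loader))['input']   # your loaders return a dict with 'input' and 'label'
    if cuda:
        batch = batch.cuda(device=device)
    out_full, _ = model(batch)                # forward the whole batch
    out_single, _ = model(batch[:1])          # forward only the first sample
    print(out_full[0])
    print(out_single[0])                      # should match out_full[0] if batch size does not matter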

sure! just a minute…

This is the output of the final LogSoftmax layer for a batch size of 512 (I am omitting part of it so that the forum lets me post it):

-2.690155601501464844e+01 -8.082249450683593750e+01 -3.616087722778320312e+01 -3.803614139556884766e+00 -1.360958194732666016e+01 -4.190422439575195312e+01 -1.716705513000488281e+01 -1.744507598876953125e+01 -1.189443874359130859e+01 -4.641435623168945312e+01
-1.381118583679199219e+01 -1.968629837036132812e+01 -1.586458396911621094e+01 -2.607204627990722656e+01 -3.955638885498046875e+01 -3.761519193649291992e+00 -2.691738891601562500e+01 -3.530610656738281250e+01 -1.903606224060058594e+01 -3.330869674682617188e+01
-2.380311203002929688e+01 -2.999114608764648438e+01 -4.187136650085449219e+00 -2.493428802490234375e+01 -2.390682220458984375e+01 -1.868707847595214844e+01 -1.381414985656738281e+01 -2.050655364990234375e+01 -2.902218627929687500e+01 -5.156520462036132812e+01
-4.055553817749023438e+01 -4.072654724121093750e+00 -3.130643081665039062e+01 -5.238175582885742188e+01 -6.296509552001953125e+01 -4.269557952880859375e+01 -5.359529876708984375e+01 -5.323943710327148438e+01 -4.236874389648437500e+01 -3.005449867248535156e+01
-5.584639358520507812e+01 -8.653271484375000000e+01 -3.353703308105468750e+01 -2.608614921569824219e+01 -1.751148414611816406e+01 -6.213397598266601562e+01 -4.815479278564453125e+01 -4.242191314697265625e+00 -4.553392791748046875e+01 -9.855609893798828125e+01
-1.203245925903320312e+01 -1.687045478820800781e+01 -1.372485160827636719e+01 -2.218313980102539062e+01 -3.322996139526367188e+01 -3.624969482421875000e+00 -2.281215286254882812e+01 -2.994762802124023438e+01 -1.638219070434570312e+01 -2.809778404235839844e+01
-3.542594146728515625e+01 -1.036188278198242188e+02 -4.309928512573242188e+01 -1.152541255950927734e+01 -3.487170934677124023e+00 -5.836009597778320312e+01 -2.234853363037109375e+01 -1.220743560791015625e+01 -2.573122215270996094e+01 -7.186666870117187500e+01
-2.466860389709472656e+01 -3.135066986083984375e+01 -4.166416645050048828e+00 -2.583732986450195312e+01 -2.465964889526367188e+01 -1.941736030578613281e+01 -1.405472564697265625e+01 -2.114776802062988281e+01 -3.017933845520019531e+01 -5.384698104858398438e+01
-4.127064704895019531e+00 -1.877523803710937500e+01 -1.274282455444335938e+01 -9.385772705078125000e+00 -1.298989200592041016e+01 -1.141229343414306641e+01 -8.155844688415527344e+00 -1.573155784606933594e+01 -8.865839004516601562e+00 -1.400935554504394531e+01
-4.084252834320068359e+00 -1.908169937133789062e+01 -1.252236747741699219e+01 -9.234300613403320312e+00 -1.258626079559326172e+01 -1.144107055664062500e+01 -7.732525348663330078e+00 -1.544504737854003906e+01 -9.017238616943359375e+00 -1.454797172546386719e+01
-3.975222396850585938e+01 -4.072445869445800781e+00 -3.078602409362792969e+01 -5.138525772094726562e+01 -6.178840637207031250e+01 -4.187569046020507812e+01 -5.257308197021484375e+01 -5.231076049804687500e+01 -4.155389785766601562e+01 -2.947208976745605469e+01
-3.123294830322265625e+01 -9.020487976074218750e+01 -3.787038803100585938e+01 -1.056235027313232422e+01 -3.583448171615600586e+00 -5.094428253173828125e+01 -1.996979141235351562e+01 -1.125793361663818359e+01 -2.285365486145019531e+01 -6.285837173461914062e+01
-1.499895477294921875e+01 -5.100203704833984375e+01 -1.250618553161621094e+01 -1.890448379516601562e+01 -1.810605621337890625e+01 -2.396559906005859375e+01 -3.685824632644653320e+00 -2.261679649353027344e+01 -2.511251831054687500e+01 -5.419168472290039062e+01
-1.734498023986816406e+01 -2.533816528320312500e+01 -2.026508903503417969e+01 -3.387181854248046875e+01 -5.228700256347656250e+01 -4.035984992980957031e+00 -3.520900726318359375e+01 -4.609833908081054688e+01 -2.428531837463378906e+01 -4.355126571655273438e+01
-1.231824493408203125e+01 -1.733354187011718750e+01 -1.409014320373535156e+01 -2.281948089599609375e+01 -3.427051925659179688e+01 -3.647494316101074219e+00 -2.349081420898437500e+01 -3.083149337768554688e+01 -1.680347251892089844e+01 -2.891879463195800781e+01
-4.168481349945068359e+00 -7.396075439453125000e+01 -4.082975769042968750e+01 -2.257537841796875000e+01 -3.564733886718750000e+01 -4.084517288208007812e+01 -1.726690673828125000e+01 -4.535559082031250000e+01 -2.421278953552246094e+01 -4.959044265747070312e+01
-1.965868186950683594e+01 -5.605549621582031250e+01 -2.587337112426757812e+01 -3.987432479858398438e+00 -1.033438968658447266e+01 -2.947753906250000000e+01 -1.310202026367187500e+01 -1.330910682678222656e+01 -9.715766906738281250e+00 -3.316635131835937500e+01
-3.479168701171875000e+01 -8.166891479492187500e+01 -5.464433288574218750e+01 -2.850371170043945312e+01 -4.872769927978515625e+01 -5.005518341064453125e+01 -4.506411743164062500e+01 -5.162623596191406250e+01 -3.928067922592163086e+00 -3.166503715515136719e+01
-5.030834579467773438e+01 -4.067288398742675781e+00 -3.832155227661132812e+01 -6.484884643554687500e+01 -7.798222351074218750e+01 -5.316851043701171875e+01 -6.647455596923828125e+01 -6.537792968750000000e+01 -5.237901306152343750e+01 -3.686334609985351562e+01
-3.729444503784179688e+01 -8.794846343994140625e+01 -5.875020599365234375e+01 -3.047664260864257812e+01 -5.233389282226562500e+01 -5.387952423095703125e+01 -4.839521026611328125e+01 -5.537379837036132812e+01 -3.903459072113037109e+00 -3.384850311279296875e+01
-1.544962406158447266e+01 -2.230466079711914062e+01 -1.791401100158691406e+01 -2.969276428222656250e+01 -4.547164154052734375e+01 -3.889443159103393555e+00 -3.077057266235351562e+01 -4.032004547119140625e+01 -2.146554374694824219e+01 -3.804282760620117188e+01
-4.058610439300537109e+00 -1.376746749877929688e+01 -9.614157676696777344e+00 -7.770594120025634766e+00 -1.003936576843261719e+01 -8.472715377807617188e+00 -6.603385925292968750e+00 -1.226200103759765625e+01 -7.491374969482421875e+00 -1.115189933776855469e+01
-1.114142704010009766e+01 -2.219174957275390625e+01 -1.596575546264648438e+01 -1.016525650024414062e+01 -1.504627132415771484e+01 -1.393549728393554688e+01 -1.390664672851562500e+01 -1.656463050842285156e+01 -4.159469127655029297e+00 -1.087591171264648438e+01
-2.040389251708984375e+01 -7.088414764404296875e+01 -1.523353958129882812e+01 -2.604238891601562500e+01 -2.423077392578125000e+01 -3.292998504638671875e+01 -3.408622503280639648e+00 -2.998207473754882812e+01 -3.507639312744140625e+01 -7.717485809326171875e+01
-1.422254657745361328e+01 -4.484158706665039062e+01 -1.078249168395996094e+01 -1.766413879394531250e+01 -1.638128280639648438e+01 -2.127746963500976562e+01 -3.704056739807128906e+00 -2.021722412109375000e+01 -2.308171653747558594e+01 -4.890063858032226562e+01
-6.136498641967773438e+01 -9.465768432617187500e+01 -3.610197448730468750e+01 -2.857654571533203125e+01 -1.911444854736328125e+01 -6.813326263427734375e+01 -5.272266387939453125e+01 -4.132856369018554688e+00 -5.017552566528320312e+01 -1.087619400024414062e+02
-2.134333992004394531e+01 -6.163212203979492188e+01 -2.819044303894042969e+01 -3.948542833328247070e+00 -1.096827507019042969e+01 -3.231762313842773438e+01 -1.405270576477050781e+01 -1.413963890075683594e+01 -1.030910110473632812e+01 -3.626287460327148438e+01
-1.611326408386230469e+01 -2.335536193847656250e+01 -1.871491813659667969e+01 -3.114327812194824219e+01 -4.783172607421875000e+01 -3.940395116806030273e+00 -3.230304718017578125e+01 -4.231876754760742188e+01 -2.245649147033691406e+01 -3.998496246337890625e+01
-1.851421737670898438e+01 -2.720315933227539062e+01 -2.169389915466308594e+01 -3.643811416625976562e+01 -5.646389770507812500e+01 -4.125447750091552734e+00 -3.792764663696289062e+01 -4.963721847534179688e+01 -2.603151893615722656e+01 -4.697106552124023438e+01
-5.876892471313476562e+01 -9.092440795898437500e+01 -3.499267959594726562e+01 -2.739908790588378906e+01 -1.834819793701171875e+01 -6.535050201416015625e+01 -5.062402343750000000e+01 -4.184595108032226562e+00 -4.797157287597656250e+01 -1.039532089233398438e+02
-7.917993164062500000e+01 -1.219562988281250000e+02 -4.513483810424804688e+01 -3.630667114257812500e+01 -2.398303985595703125e+01 -8.795501708984375000e+01 -6.752887725830078125e+01 -3.772157192230224609e+00 -6.478369140625000000e+01 -1.415445556640625000e+02
-6.323521041870117188e+01 -9.763261413574218750e+01 -3.710785675048828125e+01 -2.936536788940429688e+01 -1.960568237304687500e+01 -7.025308227539062500e+01 -5.427588272094726562e+01 -4.094451904296875000e+00 -5.168375015258789062e+01 -1.122126770019531250e+02
-8.759860038757324219e+00 -9.706142425537109375e+00 -1.313257217407226562e+01 -1.258201980590820312e+01 -1.610619354248046875e+01 -1.201622867584228516e+01 -1.333721923828125000e+01 -1.675129127502441406e+01 -8.950416564941406250e+00 -4.275278568267822266e+00
-4.126167297363281250e+00 -5.493610382080078125e+01 -3.090803718566894531e+01 -1.784050750732421875e+01 -2.744311523437500000e+01 -3.057565116882324219e+01 -1.382269096374511719e+01 -3.479645156860351562e+01 -1.891181373596191406e+01 -3.745549774169921875e+01
-7.279552459716796875e+00 -7.940578460693359375e+00 -4.283311367034912109e+00 -8.163366317749023438e+00 -8.280757904052734375e+00 -5.495534896850585938e+00 -6.435780048370361328e+00 -8.410377502441406250e+00 -8.367765426635742188e+00 -1.181931114196777344e+01
-2.451872444152832031e+01 -7.291885375976562500e+01 -3.288162612915039062e+01 -3.858707189559936523e+00 -1.271041107177734375e+01 -3.786814117431640625e+01 -1.579201126098632812e+01 -1.628011703491210938e+01 -1.104763412475585938e+01 -4.199660110473632812e+01
-1.759007835388183594e+01 -1.984016227722167969e+01 -2.902506828308105469e+01 -2.685966110229492188e+01 -3.634279251098632812e+01 -2.857014656066894531e+01 -2.931857872009277344e+01 -3.622889328002929688e+01 -1.732245254516601562e+01 -4.110193252563476562e+00
-4.752595138549804688e+01 -4.068867206573486328e+00 -3.630899810791015625e+01 -6.128831100463867188e+01 -7.368952941894531250e+01 -5.017343139648437500e+01 -6.279265213012695312e+01 -6.190445709228515625e+01 -4.952318191528320312e+01 -3.492961502075195312e+01
-1.857810020446777344e+01 -2.122815704345703125e+01 -3.101051902770996094e+01 -2.849893188476562500e+01 -3.873007965087890625e+01 -3.054580879211425781e+01 -3.119842910766601562e+01 -3.857604598999023438e+01 -1.822972869873046875e+01 -4.083828926086425781e+00
-2.072194671630859375e+01 -2.333090972900390625e+01 -3.461248397827148438e+01 -3.194519805908203125e+01 -4.354048538208007812e+01 -3.441202163696289062e+01 -3.499110031127929688e+01 -4.314307403564453125e+01 -2.031342315673828125e+01 -4.055376529693603516e+00
-2.031427383422851562e+01 -3.011381912231445312e+01 -2.396684074401855469e+01 -4.042638397216796875e+01 -6.296613311767578125e+01 -4.263709068298339844e+00 -4.218260192871093750e+01 -5.515073013305664062e+01 -2.871058464050292969e+01 -5.219810485839843750e+01
-3.040139770507812500e+01 -7.058265686035156250e+01 -4.746952056884765625e+01 -2.510282516479492188e+01 -4.246476745605468750e+01 -4.334822845458984375e+01 -3.930010223388671875e+01 -4.510353088378906250e+01 -3.971200942993164062e+00 -2.778851127624511719e+01
-2.200777816772460938e+01 -6.440445709228515625e+01 -2.930238723754882812e+01 -3.922076463699340820e+00 -1.157084560394287109e+01 -3.355789947509765625e+01 -1.436968040466308594e+01 -1.484917449951171875e+01 -1.033179855346679688e+01 -3.752152633666992188e+01
-4.112522602081298828e+00 -1.366378021240234375e+01 -9.944454193115234375e+00 -8.036000251770019531e+00 -1.062376594543457031e+01 -8.556116104125976562e+00 -7.098401069641113281e+00 -1.274724960327148438e+01 -7.442222595214843750e+00 -1.081157684326171875e+01
-1.216378498077392578e+01 -2.036059188842773438e+01 -3.901335000991821289e+00 -1.344072532653808594e+01 -1.114278030395507812e+01 -1.044506072998046875e+01 -4.308685779571533203e+00 -1.221001052856445312e+01 -1.596279716491699219e+01 -2.897383308410644531e+01
-2.955357742309570312e+01 -3.796686553955078125e+01 -4.134057998657226562e+00 -3.081291198730468750e+01 -2.924823760986328125e+01 -2.333085823059082031e+01 -1.617069244384765625e+01 -2.473273658752441406e+01 -3.632819366455078125e+01 -6.569927978515625000e+01
-4.144152164459228516e+00 -3.874533081054687500e+01 -2.298302078247070312e+01 -1.427498626708984375e+01 -2.133903884887695312e+01 -2.217758369445800781e+01 -1.154710483551025391e+01 -2.658267784118652344e+01 -1.449073410034179688e+01 -2.695484924316406250e+01
-1.967094230651855469e+01 -2.210584259033203125e+01 -3.271884918212890625e+01 -3.025439071655273438e+01 -4.114514923095703125e+01 -3.244316101074218750e+01 -3.309582138061523438e+01 -4.083662414550781250e+01 -1.932283210754394531e+01 -4.075018405914306641e+00
-1.156970214843750000e+01 -3.723593139648437500e+01 -1.028598594665527344e+01 -1.435439205169677734e+01 -1.396442890167236328e+01 -1.779047203063964844e+01 -3.847182273864746094e+00 -1.752497673034667969e+01 -1.865164947509765625e+01 -3.903738021850585938e+01
-5.022365951538085938e+01 -4.067265510559082031e+00 -3.827227020263671875e+01 -6.475291442871093750e+01 -7.787551879882812500e+01 -5.308722686767578125e+01 -6.637127685546875000e+01 -6.529892730712890625e+01 -5.229849624633789062e+01 -3.680891036987304688e+01
-2.536238861083984375e+01 -7.591044616699218750e+01 -3.415812683105468750e+01 -3.834064722061157227e+00 -1.320643520355224609e+01 -3.935348892211914062e+01 -1.625482177734375000e+01 -1.688227462768554688e+01 -1.118497657775878906e+01 -4.341341018676757812e+01
-2.682498359680175781e+01 -8.078113555908203125e+01 -3.618523406982421875e+01 -3.800046443939208984e+00 -1.376593875885009766e+01 -4.183815383911132812e+01 -1.708557319641113281e+01 -1.761177635192871094e+01 -1.169411087036132812e+01 -4.609952926635742188e+01
-1.381227779388427734e+01 -3.501851654052734375e+01 -1.634001159667968750e+01 -6.638511180877685547e+00 -3.977850437164306641e+00 -2.028442955017089844e+01 -9.923542022705078125e+00 -7.489879608154296875e+00 -1.102954101562500000e+01 -2.554369735717773438e+01
-3.592675399780273438e+01 -1.052188796997070312e+02 -4.371757507324218750e+01 -1.163378143310546875e+01 -3.475741386413574219e+00 -5.923502349853515625e+01 -2.262634086608886719e+01 -1.231928443908691406e+01 -2.606977272033691406e+01 -7.294050598144531250e+01
-2.420422172546386719e+01 -3.063054275512695312e+01 -4.177126407623291016e+00 -2.535076522827148438e+01 -2.425317764282226562e+01 -1.903213691711425781e+01 -1.392302131652832031e+01 -2.079833793640136719e+01 -2.955807495117187500e+01 -5.262644958496093750e+01
-5.555401611328125000e+01 -8.550027465820312500e+01 -3.286003494262695312e+01 -2.613315963745117188e+01 -1.761076545715332031e+01 -6.153792190551757812e+01 -4.779085540771484375e+01 -4.251764297485351562e+00 -4.553187179565429688e+01 -9.819395446777343750e+01
-1.934419822692871094e+01 -2.855635833740234375e+01 -2.276466178894042969e+01 -3.828952407836914062e+01 -5.948717498779296875e+01 -4.189187049865722656e+00 -3.991167831420898438e+01 -5.220192718505859375e+01 -2.726786804199218750e+01 -4.937199020385742188e+01
-2.511015510559082031e+01 -3.196306610107421875e+01 -4.162716865539550781e+00 -2.629226303100585938e+01 -2.507508087158203125e+01 -1.976881217956542969e+01 -1.423503684997558594e+01 -2.147984123229980469e+01 -3.074202537536621094e+01 -5.493445205688476562e+01
-4.159975528717041016e+00 -4.161846160888671875e+01 -2.468890953063964844e+01 -1.506099700927734375e+01 -2.277891540527343750e+01 -2.383108901977539062e+01 -1.227544212341308594e+01 -2.836776733398437500e+01 -1.523793220520019531e+01 -2.859848403930664062e+01
-2.294694709777832031e+01 -6.732299804687500000e+01 -3.055585289001464844e+01 -3.901713848114013672e+00 -1.190355777740478516e+01 -3.510186767578125000e+01 -1.492453193664550781e+01 -1.526358032226562500e+01 -1.062432289123535156e+01 -3.910974502563476562e+01
-6.751013183593750000e+01 -1.042312164306640625e+02 -3.934717941284179688e+01 -3.121002960205078125e+01 -2.075756835937500000e+01 -7.503845977783203125e+01 -5.786325836181640625e+01 -4.007836341857910156e+00 -5.516710662841796875e+01 -1.200475540161132812e+02
-1.165893077850341797e+01 -2.643288421630859375e+01 -1.295825958251953125e+01 -4.303713321685791016e+00 -4.586702346801757812e+00 -1.463102245330810547e+01 -8.459090232849121094e+00 -6.592896461486816406e+00 -7.630463123321533203e+00 -1.918774032592773438e+01
-1.884151649475097656e+01 -2.773086547851562500e+01 -2.210214614868164062e+01 -3.716085433959960938e+01 -5.764160156250000000e+01 -4.150489330291748047e+00 -3.869596099853515625e+01 -5.063650512695312500e+01 -2.652139663696289062e+01 -4.792680740356445312e+01

And this is what I get for a batch size of 1:

0.000000000000000000e+00 0.000000000000000000e+00 0.000000000000000000e+00 0.000000000000000000e+00 0.000000000000000000e+00 0.000000000000000000e+00 0.000000000000000000e+00 0.000000000000000000e+00 0.000000000000000000e+00 0.000000000000000000e+00

something is definitely wrong somewhere that I can’t figure out…

The best guess I have is that it is still computing batch statistics at run time and using them to normalize the data, which with a batch size of 1 amounts to instance normalization and hence zero output, since mean(x) = x. But that makes no sense with .eval().

Can you print model.training during your evaluation to make sure your model is really in eval mode?
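Something like this is enough (reusing your model variable, and also listing the batch norm layers just to be safe):

print(model.training)   # should print False after model.eval()
for name, module in model.named_modules():
    if isinstance(module, nn.BatchNorm2d):
        print(name, module.training, module.track_running_stats)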

of course. let me do it

Yes, it’s printing False, meaning it is in eval mode.

But even if it weren’t, why would larger batch sizes at test time give me better accuracy while a batch size of 1 doesn’t?

If your model would not be in eval mode, the normalization layers would not use the tracked statistics, but use per-batch stats. If you have only one sample, these stats might not be representative. But since your model is in eval mode, this should not matter.

Are you using any transformations inside your dataloader?

I could imagine that you apply some normalization only in your train loader but not in your val_loader, so the images are not normalized properly inside your validation script.
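What I mean is something like this (illustration only, with torchvision transforms rather than your actual pipeline): the same normalization has to be applied in both loaders, only the augmentation should differ:

from torchvision import transforms

normalize = transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225])
train_transform = transforms.Compose([
    transforms.RandomHorizontalFlip(),   # augmentation only at training time
    transforms.ToTensor(),
    normalize,
])
val_transform = transforms.Compose([
    transforms.ToTensor(),
    normalize,                           # the very same normalization at validation/test time
])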

Thanks for your time. I am using augmentations, but they shouldn’t matter. Here is my dataloader code:



from __future__ import print_function
from __future__ import division
import os
import cv2
import gdal
import json
import torch
import random
import numpy as np
random.seed(74)
import matplotlib.pyplot as pl
from torch.utils.data import Dataset, DataLoader
import imgaug as ia
from imgaug import augmenters as iaa


# will implement all functionality (data augmentation) of doing
# 1. random crops,
# 2. random flips,
# 3. random rotations,

all_labels = {
    'AnnualCrop'           : 0,
    'Forest'               : 1,
    'HerbaceousVegetation' : 2,
    'Highway'              : 3,
    'Industrial'           : 4,
    'Pasture'              : 5,
    'PermanentCrop'        : 6,
    'Residential'          : 7,
    'River'                : 8,
    'SeaLake'              : 9
}

def toTensor(image):
    "converts a single input image to tensor"
    # swap color axis because
    # numpy image: H x W x C
    # torch image: C X H X W
    image = image.transpose((2, 0, 1))
    return torch.from_numpy(image).float()

######################################################################################################
# Sometimes(0.5, ...) applies the given augmenter in 50% of all cases,
# e.g. Sometimes(0.5, GaussianBlur(0.3)) would blur roughly every second
# image.
sometimes = lambda aug: iaa.Sometimes(0.5, aug)

# Define our sequence of augmentation steps that will be applied to every image.
seq = iaa.Sequential(
    [
        #
        # Apply the following augmenters to most images.
        #
        iaa.Fliplr(0.5), # horizontally flip 50% of all images
        iaa.Flipud(0.5), # vertically flip 50% of all images

        # crop some of the images by 0-20% of their height/width
        sometimes(iaa.Crop(percent=(0, 0.2))),

        # Apply affine transformations to some of the images
        # - scale to 80-120% of image height/width (each axis independently)
        # - translate by -20 to +20 relative to height/width (per axis)
        # - rotate by -180 to +175 degrees
        # - mode: use any available mode to fill newly created pixels
        #         see API or scikit-image for which modes are available
        sometimes(iaa.Affine(
            scale={"x": (0.8, 1.2), "y": (0.8, 1.2)},
            translate_percent={"x": (-0.2, 0.2), "y": (-0.2, 0.2)},
            rotate=(-180, 175),
            mode=ia.ALL
        )),
    ],
    # do all of the above augmentations in random order
    random_order=True
)
######################################################################################################


def get_dataloaders(base_folder, batch_size):
    print('inside dataloading code...')

    class dataset(Dataset):
        def __init__(self, data_dictionary, bands, mode='train'):
            super(dataset, self).__init__()
            self.example_dictionary = data_dictionary
            # with open(mode+'.txt', 'wb') as this:
            #     this.write(json.dumps(self.example_dictionary))
            self.bands = bands # bands are a list bands to use as data, pass them as a list []
            self.mode = mode
            pass

        def __getitem__(self, k):
            example_path, label_name = self.example_dictionary[k]
            # print(example_path, label_name)
            # example is a tiff image, need to use gdal
            this_example = gdal.Open(example_path)
            this_label = all_labels[label_name]
            example_array = this_example.GetRasterBand(self.bands[0]).ReadAsArray()
            for i in self.bands[1:]:
                example_array = np.dstack((example_array,
                                           this_example.GetRasterBand(i).ReadAsArray())).astype(np.int16)

            # transforms
            if self.mode == 'train':
                example_array = np.squeeze(seq.augment_images(
                    (np.expand_dims(example_array, axis=0))), axis=0)
                pass

            # scale the 12-bit values down to roughly [0, 1]
            example_array = example_array.astype(np.float32) / 4096
            example_array = toTensor(image=example_array)
            return {'input': example_array, 'label': this_label}

        def __len__(self):
            return len(self.example_dictionary)

    # create training set examples dictionary
    all_examples = {}
    for folder in sorted(os.listdir(base_folder)):
        # each folder name is a label itself
        # new folder, new dictionary!
        class_examples = []
        inner_path = os.path.join(base_folder, folder)
        for image in [x for x in os.listdir(inner_path) if x.endswith('.tif')]:
            image_path = os.path.join(inner_path, image)
            # for each index as key, we want to have its path and label as its items
            class_examples.append(image_path)
        all_examples[folder] = class_examples

    # split them into train and test
    train_dictionary, val_dictionary, test_dictionary = {}, {}, {}
    for class_name in all_examples.keys():
        class_examples = all_examples[class_name]
        # print(class_examples)
        random.shuffle(class_examples)

        total = len(class_examples)
        train_count = int(total * 0.8); train_ = class_examples[:train_count]
        test = class_examples[train_count:]

        total = len(train_)
        train_count = int(total * 0.9); train = train_[:train_count]
        validation = train_[train_count:]

        for example in train:
            train_dictionary[len(train_dictionary)] = (example, class_name)
        for example in test:
            test_dictionary[len(test_dictionary)] = (example, class_name)
        for example in validation:
            val_dictionary[len(val_dictionary)] = (example, class_name)


    # create dataset class instances
    bands = [4, 3, 2]
    train_data = dataset(data_dictionary=train_dictionary, bands=bands, mode='train')
    val_data = dataset(data_dictionary=val_dictionary, bands=bands, mode='eval')
    test_data = dataset(data_dictionary=test_dictionary, bands=bands, mode='test')
    print('train examples =', len(train_dictionary), 'val examples =', len(val_dictionary),
          'test examples =', len(test_dictionary))

    train_dataloader = DataLoader(dataset=train_data, batch_size=batch_size,
                                  shuffle=True, num_workers=4)
    val_dataloader = DataLoader(dataset=val_data, batch_size=batch_size,
                                shuffle=True, num_workers=4)
    test_dataloader = DataLoader(dataset=test_data, batch_size=batch_size,
                                 shuffle=True, num_workers=4)

    return train_dataloader, val_dataloader, test_dataloader


def histogram_equalization(in_image):
    for i in range(in_image.shape[2]):  # each channel
        image = in_image[:, :, i]
        prev_shape = image.shape
        # Flatten the image into 1 dimension: pixels
        pixels = image.flatten()

        # Generate a cumulative histogram
        cdf, bins, patches = pl.hist(pixels, bins=256, range=(0, 256), density=True, cumulative=True)
        new_pixels = np.interp(pixels, bins[:-1], cdf * 255)
        in_image[:, :, i] = new_pixels.reshape(prev_shape)
    return in_image


def main():
    train_dataloader, val_dataloader, test_dataloader = get_dataloaders(base_folder='/home/annus/Desktop/'
                                                                                    'forest_cover_change/'
                                                                                    'eurosat/images/tif',
                                                                        batch_size=1)
    # #
    # train_dataloader, val_dataloader, test_dataloader = get_dataloaders(base_folder='Eurosat/tif/',
    #                                                                     batch_size=16)

    count = 0
    reversed_labels = {v: k for k, v in all_labels.items()}
    while True:
        count += 1
        for idx, data in enumerate(train_dataloader):
            examples, labels = data['input'], data['label']
            print('{} -> on batch {}/{}, {}'.format(count, idx + 1, len(train_dataloader), examples.size()))
            if True:
                this = np.max(examples[0].numpy())
                print(this)
                this = (examples[0].numpy() * 255).transpose(1, 2, 0).astype(np.uint8)
                # this = histogram_equalization(this)
                pl.imshow(this)
                pl.title('{}'.format(reversed_labels[int(labels.numpy())]))
                pl.show()


if __name__ == '__main__':
    main()

BTW, I divide by 4096 because my images are 12-bit mosaics of satellite imagery (Sentinel-2). I just thought bringing them into the 0–1 range might help learning. The test function at the end is what I used to check that I could reconstruct the images, and it worked perfectly.

Thank you so much everyone for your help. Steve_cruz helped me solve my error. I retrained my model after removing the last softmax layer, since cross entropy loss applies softmax itself. I also decorated my evaluation function with torch.no_grad(), and got the model running. It now gives me good accuracy with any batch size, although the accuracies still vary by 2–3% (93–95%) across batch sizes for some reason; I’ll try to find a fix for that. Thanks for your time everyone!
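For anyone who runs into the same thing later: the nn.LogSoftmax(dim=0) at the end of my classifier was presumably the real culprit, because dim=0 normalizes across the batch dimension instead of across the 10 classes, so the output for one image depends on the other images in the batch (and a batch of one gives log(1) = 0 for every class, which is exactly the zero row above). A rough sketch of the corrected head and evaluation, keeping the same layer sizes as before:

self.classifier = nn.Sequential(
    nn.Linear(in_features=2048*4, out_features=1024), nn.ReLU(),
    nn.Linear(in_features=1024, out_features=512), nn.ReLU(), nn.Dropout(p=0.8),
    nn.Linear(in_features=512, out_features=256), nn.ReLU(),
    nn.Linear(in_features=256, out_features=128), nn.ReLU(), nn.Dropout(p=0.8),
    nn.Linear(in_features=128, out_features=10),   # raw logits; CrossEntropyLoss applies log-softmax itself
)

# evaluation wrapped in no_grad so no gradients are tracked
model.eval()
with torch.no_grad():
    out_x, pred = model(test_x)   # pred is still the argmax over dim=1

This would also explain why shuffling the test loader changed the accuracy: each image's scores were being normalized against whatever else happened to be in its batch.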

It seems I have a similar problem to yours: as the validation batch size gets smaller, the validation accuracy gets worse. Could you please explain your solution in more detail? Thank you.

I think you may be calculating the accuracy incorrectly. It should be avg_accuracy = total_correct_samples / total_samples, not avg_accuracy = (acc1 + acc2 + ... + accN) / N, i.e. don’t average per-batch accuracies (the last, smaller batch would then be weighted incorrectly).
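Something like this (a minimal sketch using the dict-style loaders from above) accumulates the number of correct predictions and the number of samples over the whole set, then divides once at the end. Counting the nominal batch_size for the smaller last batch, or averaging per-batch accuracies, makes the result change with the batch size:

correct, total = 0, 0
model.eval()
with torch.no_grad():
    for data in test_loader:
        test_x, label = data['input'], data['label']
        out_x, pred = model(test_x)
        correct += (pred == label).sum().item()
        total += label.size(0)   # actual size of this batch, not the nominal batch_size
print('accuracy = {:.2f}%'.format(100.0 * correct / total))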