I trained the official MNIST example, set the random seed to 1 to remove all randomness, and set mbsize = 1 or 2 to observe how the loss changes.
I used 2 ways to calculate the 1st and 2nd sample's loss. One is to set mbsize = 1 and run 2 iterations (without optimizer.step()):
```python
for batch_idx, (data, target) in enumerate(train_loader):
    if args.cuda:
        data, target = data.cuda(), target.cuda()
    data, target = Variable(data), Variable(target)
    optimizer.zero_grad()
    output = model(data)
    loss = F.nll_loss(output, target)
    print(loss.data)
    loss.backward()
```
The losses of the 1st and 2nd samples are:
The other is to set mbsize = 2, split the mini-batch into individual samples, and feed each one forward separately:
```python
for batch_idx, (data, target) in enumerate(train_loader):
    if args.cuda:
        data, target = data.cuda(), target.cuda()
    mbsize = data.size(0)
    optimizer.zero_grad()
    for i in range(mbsize):
        data_x, target_x = Variable(data[i:i+1]), Variable(target[i:i+1])
        # optimizer.zero_grad()
        output = model(data_x)
        loss = F.nll_loss(output, target_x)
        print(loss.data)
        loss.backward()
```
The losses of the 1st and 2nd samples are the same as above:
However, when I set mbsize = 2 and compute the mini-batch loss in a single forward pass, the loss becomes:
It seems to be neither the sum nor the average of the two per-sample losses, so what is it?
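For reference, here is a minimal sketch of what I expected the batched loss to be: F.nll_loss reduces over the batch by default (size_average=True in older PyTorch, reduction='mean' in newer versions), so the batched loss should equal the mean of the per-sample losses. The tensors below are toy stand-ins for the model output, not values from the actual MNIST run:

```python
import torch
import torch.nn.functional as F

torch.manual_seed(1)

# Toy log-probabilities for a mini-batch of 2 samples, 10 classes
# (stand-ins for the real model output).
output = F.log_softmax(torch.randn(2, 10), dim=1)
target = torch.tensor([3, 7])

per_sample = F.nll_loss(output, target, reduction='none')  # loss of each sample
batch_loss = F.nll_loss(output, target)                    # default: mean over batch

print(per_sample)
print(per_sample.mean())  # equals batch_loss under the default reduction
print(batch_loss)
```

If the per-sample losses from the single-sample runs don't average to the batched loss, something other than the reduction must differ between the two setups.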