Entire dataset, mini batch and epoch

Santhoshnumberone · October 16, 2018, 6:10am

Example:
I have a dataset of size 801.
60% of it is used for training, so my training dataset has approximately 481 samples.

dataloaders = {x: torch.utils.data.DataLoader(image_datasets[x], batch_size=15, shuffle=True, num_workers=15) for x in ['train', 'val']}

The statement above fetches a mini-batch of 15 samples at a time from the training dataset.
So, to go through the entire training dataset, I need 481/15 ≈ 32 mini-batches.
Therefore the number of epochs required to complete training over the entire dataset is 32, as given by the statement below.
model_ft = train_model(net,criterion,optimizer_ft,exp_lr_scheduler,num_epochs=32)

Am I right? Please correct me if I am wrong.

ptrblck · October 16, 2018, 6:23am

You will need about 32 iterations to complete one epoch. If you use your DataLoader in a for loop, a whole epoch is completed once the loop has yielded about 32 batches of size 15.

I’m not sure how train_model is defined, but it seems you are confusing iterations/batches with epochs.
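To make the distinction concrete, here is a minimal sketch in plain Python. It uses made-up numbers (100 samples, batch size 10, 3 epochs) rather than the ones from this thread, and the helper `count_iterations` is hypothetical: it only counts loop steps and stands in for a real training loop.

```python
def count_iterations(num_samples, batch_size, num_epochs):
    """Return (iterations per epoch, total iterations) for a simple nested loop."""
    iterations_per_epoch = 0
    total_iterations = 0
    for epoch in range(num_epochs):
        # One epoch = one full pass over the dataset.
        iterations_per_epoch = 0
        for start in range(0, num_samples, batch_size):
            # One iteration = one mini-batch starting at index `start`.
            iterations_per_epoch += 1
            total_iterations += 1
    return iterations_per_epoch, total_iterations


per_epoch, total = count_iterations(num_samples=100, batch_size=10, num_epochs=3)
print(per_epoch, total)  # 10 30
```

The inner loop plays the role of iterating over a DataLoader, and the outer loop plays the role of num_epochs.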

Santhoshnumberone · October 16, 2018, 6:43am

Here is how train_model is defined:

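import copy
import time

import torch

# Assumes dataloaders, dataset_sizes and device are defined elsewhere (not shown in this post).
def train_model(model, criterion, optimizer, scheduler, num_epochs=20):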
    since = time.time()

    best_model_wts = copy.deepcopy(model.state_dict())
    best_acc = 0.0

    for epoch in range(num_epochs):
        print('Epoch {}/{}'.format(epoch, num_epochs - 1))
        print('-' * 10)

        for phase in ['train','val']:

            if phase == 'train':
                scheduler.step()
                model.train()

            else:
                model.eval()
            running_loss = 0.0
            running_corrects = 0


            for inputs,labels in dataloaders[phase]:
                inputs = inputs.to(device)
                labels = labels.to(device)

                optimizer.zero_grad()

                # forward
                # track history only in the train phase
                with torch.set_grad_enabled(phase=='train'):
                    outputs = model(inputs)
                    _,preds = torch.max(outputs,1)
                    loss = criterion(outputs,labels)

                    if phase == 'train':
                        loss.backward()
                        optimizer.step()

                #statistics
                running_loss += loss.item()*inputs.size(0)
                running_corrects += torch.sum(preds==labels.data)

            epoch_loss = running_loss/dataset_sizes[phase]
            epoch_acc = running_corrects.double()/dataset_sizes[phase]
            print('{} Loss: {:.4f} Acc: {:.4f}'.format(
                    phase, epoch_loss, epoch_acc))

            #deep copy the model
            if phase == 'val' and epoch_acc>best_acc:
                best_acc=epoch_acc
                best_model_wts = copy.deepcopy(model.state_dict())

        print()
    time_elapsed = time.time()-since
    print('Training complete in {:.0f}m {:.0f}s'.format(
        time_elapsed // 60, time_elapsed % 60))
    print('Best val Acc: {:.4f}'.format(best_acc))

    # load best model weights
    model.load_state_dict(best_model_wts)
    return model

ptrblck · October 16, 2018, 7:26am

Based on this code you are indeed using 32 epochs, i.e. 32 full passes over your dataset.
In each epoch the loop over your DataLoader (for inputs, labels in dataloaders[phase]) will give you ~32 batches of size 15.

Santhoshnumberone · October 16, 2018, 7:44am

for inputs, labels in dataloaders[phase]:
    inputs = inputs.to(device)
    labels = labels.to(device)

    optimizer.zero_grad()

    # forward
    # track history only in the train phase
    with torch.set_grad_enabled(phase == 'train'):
        outputs = model(inputs)
        _, preds = torch.max(outputs, 1)
        loss = criterion(outputs, labels)

        if phase == 'train':
            loss.backward()
            optimizer.step()

    # statistics
    running_loss += loss.item() * inputs.size(0)
    running_corrects += torch.sum(preds == labels.data)

So this loop fetches 15 images at a time, and it takes 32 iterations over the training dataset to complete one epoch?

ptrblck · October 16, 2018, 10:54am

Yes, more or less. It will actually give you 33 batches: 32 batches of size 15 and 1 batch of size 1, since the number of samples (481) is not divisible by 15 and you didn’t specify drop_last=True in your DataLoader.
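The batch count can be checked with a small plain-Python sketch. The helper `batch_sizes` is hypothetical and not part of PyTorch; it only mimics how a dataset of `num_samples` items is split into batches, with and without dropping the incomplete last batch.

```python
def batch_sizes(num_samples, batch_size, drop_last=False):
    """Return the list of batch sizes produced in one epoch."""
    full_batches, remainder = divmod(num_samples, batch_size)
    sizes = [batch_size] * full_batches
    # The leftover samples form one smaller batch unless it is dropped.
    if remainder and not drop_last:
        sizes.append(remainder)
    return sizes


sizes = batch_sizes(481, 15)
print(len(sizes), sizes[-1])                      # 33 1
print(len(batch_sizes(481, 15, drop_last=True)))  # 32
```

With drop_last=True the single leftover sample would be skipped in that epoch; since shuffle=True is set, a different sample would usually be left out each time.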