On dropout-enhanced CNN training codes

Hi all,

I just added dropout code to my basic CNN training code in order to reduce overfitting, as follows:

class DropoutFC(nn.Module):
    def __init__(self):
        super(DropoutFC, self).__init__()
        self.fc = nn.Linear(100, 20)
        self.dropout = nn.Dropout(p=0.5)

    def forward(self, input):
        out = self.fc(input)
        out = self.dropout(out)
        return out

Net = DropoutFC()

My question is: besides the code above, does any additional code need to be added anywhere else in the original Python file?

By the way, here is how the Train class is used:
test_train = Train()

I’m not sure I understand your question.
You have created this module to combine a linear and Dropout layer, both with fixed parameters.

It looks like your model just consists of this DropoutFC layer. Usually you put Dropout between two consecutive linear layers. Currently you would drop some of your outputs.
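To illustrate the usual placement, here is a minimal sketch with dropout between two linear layers, so that hidden activations are dropped rather than the final outputs. The class name `TwoLayerHead`, the hidden size of 50, and the ReLU are hypothetical choices for illustration; only the 100 and 20 dimensions and `p=0.5` come from the post above.

```python
import torch
import torch.nn as nn


class TwoLayerHead(nn.Module):
    """Hypothetical example: dropout sits between two linear layers."""

    def __init__(self, in_features=100, hidden=50, out_features=20, p=0.5):
        super().__init__()
        self.fc1 = nn.Linear(in_features, hidden)
        self.dropout = nn.Dropout(p=p)
        self.fc2 = nn.Linear(hidden, out_features)

    def forward(self, x):
        x = torch.relu(self.fc1(x))
        x = self.dropout(x)  # zeroes hidden activations, not the outputs
        return self.fc2(x)


head = TwoLayerHead()
out = head(torch.randn(4, 100))  # batch of 4 samples with 100 features
print(out.shape)
```

With this layout the last layer always produces a full set of 20 outputs, even while dropout is active.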

What do you mean by “original python file” ?
Where would you like to put this code?

What class is Train()? It looks like a high-level wrapper.

Hi ptrblck,

Thanks for your reply. By ‘original python file’ I mean the training code without dropout. Below is the enhanced version; the beginning of the file, including several imports and the definition of one performance metric, has been omitted.

class DropoutFC(nn.Module):
    def __init__(self):
      super(DropoutFC, self).__init__()
      self.fc = nn.Linear(100,20)
      self.dropout = nn.Dropout(p=0.5)

    def forward(self, input):
      out = self.fc(input)
      out = self.dropout(out)
      return out
Net = DropoutFC() 

class Train:
    def __init__(self, root_path = "/home/simon/Downloads/TrainFramesSS/", model_name = "vgg16", number_classes = 2, path_prefix="VGG16", loadPretrain=0):
        # Init Dataset, Model and others
        self.save_prefix = path_prefix
        self.num_classes = number_classes
        self.pretrained = None
        self.aff_dataset = ImageData(root_path=root_path, train_mode = "train")
        if model_name == "resnet50":
            self.model = resnet50(pretrained=(loadPretrain==1), num_classes = number_classes, model_path = self.pretrained )
        elif model_name == "vgg16":
            self.model = vgg16(pretrained=(loadPretrain==1), num_classes = number_classes, model_path = self.pretrained)

        if torch.cuda.device_count() > 1:
            print("There are ", torch.cuda.device_count(), "GPUs!")
            self.model = nn.DataParallel(self.model)
        elif torch.cuda.device_count() == 1:
            print("There is only one GPU")
        else:
            print("Only use CPU")
        if torch.cuda.is_available():
            self.model.cuda()

    def start_train(self, epoch=10, batch_size=50, learning_rate=0.001, batch_display=1000, save_freq=1):
        # Detail of training
        self.epoch_num = epoch
        self.batch_size = batch_size
        self.lr = learning_rate
        #loss_function = nn.CrossEntropyLoss().cuda()
        #loss_function = nn.MSELoss().cuda()
        optimizer = optim.SGD(self.model.parameters(), lr=self.lr)

        for epoch in range(self.epoch_num):
            epoch_count = 0
            total_loss = 0
            dataloader = DataLoader(self.aff_dataset, batch_size=self.batch_size, shuffle=True,num_workers=8) # num_workers=8
            for i_batch, sample_batch in enumerate(dataloader):
                # Step.1 Load data and label
                images_batch, labels_batch = sample_batch['image'], sample_batch['label']
                for i in range(images_batch.shape[0]):
                    img_tmp = transforms.ToPILImage()(images_batch[i]).convert('RGB')
                labels_batch = torch.FloatTensor(labels_batch.view(-1,self.num_classes).numpy())
                if torch.cuda.is_available():
                    input_image = autograd.Variable(images_batch.cuda())
                    # `async` is a reserved word in Python 3.7+; `non_blocking` replaces it
                    gtlabel = autograd.Variable(labels_batch.cuda(non_blocking=True))
                else:
                    input_image = autograd.Variable(images_batch)
                    gtlabel = autograd.Variable(labels_batch)
                # Step.2 calculate loss
                output = self.model(input_image)
                #loss = loss_function(output, gtlabel)
                loss = calLoss(output, gtlabel)
                epoch_count += 1
                total_loss += loss
                # Step.3 Update
                optimizer.zero_grad()
                loss.backward()
                optimizer.step()
                # Check Result
                if i_batch % batch_display == 0:
                    print("Epoch : %d Batch : %d, Loss : %f, " %(epoch, i_batch, loss))
                if i_batch % batch_display == 0:
                    pred_prob, pred_label = torch.max(output, dim=1)
                    print("Input Label : ", gtlabel[:4])
                    print("Output Label : ", pred_label[:4])
                    batch_correct = (pred_label == gtlabel).sum().data[0] * 1.0 / self.batch_size
                    print("Epoch : %d, Batch : %d, Loss : %f, Batch Accuracy %f" %(epoch, i_batch, loss, batch_correct))
            # Save model
            print("Epoch %d Average Loss : %f" %(epoch, total_loss * self.batch_size / epoch_count))
            if epoch % save_freq == 0:
                torch.save(self.model.state_dict(), self.save_prefix+'_M-CCC_'+'SS_'+'%04d.pkl'%epoch)

test_train = Train() 

Would you like to add DropoutFC somewhere into Train?
Maybe I’ve missed it, but it doesn’t seem to be used.
Is your model overfitting and you would like to regularize it with your module?

As a side note, it looks like your Train class stores the computation graph in every iteration, which leads to growing memory usage and might eventually yield an out-of-memory error.
If you need to store the loss values for debugging, you should store the Python float value instead of the tensor. Use this line instead:

total_loss += loss.item()
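As a rough sketch of where that line fits, the toy loop below accumulates the loss as a Python float. The model, data, and loss function are stand-ins chosen for illustration, not the ones from the Train class.

```python
import torch
import torch.nn as nn

model = nn.Linear(10, 2)        # stand-in model
loss_function = nn.MSELoss()    # stand-in loss
optimizer = torch.optim.SGD(model.parameters(), lr=0.001)

total_loss = 0.0
for _ in range(3):
    inputs = torch.randn(5, 10)
    targets = torch.randn(5, 2)

    optimizer.zero_grad()
    loss = loss_function(model(inputs), targets)
    loss.backward()
    optimizer.step()

    # loss.item() returns a plain Python float with no link to the graph.
    # `total_loss += loss` would instead build a tensor whose grad_fn keeps
    # every iteration's graph reachable.
    total_loss += loss.item()

print(type(total_loss), total_loss / 3)
```

Because `total_loss` stays a float, nothing from earlier iterations is kept alive between batches.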

PS: I’ve formatted your post, since the code was not readable. You can add code blocks using three backticks.


Yes, the original version has an overfitting problem, so I have to use the dropout code.

How do I add DropoutFC into Train?

Many thanks for your technical note.

Sorry, ptrblck, I didn’t catch what you meant just now.

‘Would you like to add DropoutFC somewhere into Train?’ - what did you mean?

I have added DropoutFC at the top, as shown above.

Do you mean I should also add this kind of code somewhere else in Train?

In the code you’ve posted you just created an instance of DropoutFC without using it.
I’m not sure how you are using the module.

If you want to add a dropout and linear layer to your pre-trained model, you could use:

model = models.resnet50(pretrained=True)
model.fc = nn.Sequential(
    nn.Dropout(p=0.5),
    nn.Linear(2048, number_classes)
)
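One more thing to keep in mind: dropout is only active in training mode, so the model should be switched with `model.train()` before training and `model.eval()` before validation or testing. Below is a minimal sketch of the head replacement together with that switch. It assumes a tiny stand-in backbone instead of the real ResNet-50 so it runs without downloading weights; the `TinyBackbone` class and its input size of 32 are hypothetical.

```python
import torch
import torch.nn as nn


class TinyBackbone(nn.Module):
    """Hypothetical stand-in for a pretrained network with an `fc` head."""

    def __init__(self, feature_dim=2048):
        super().__init__()
        self.features = nn.Linear(32, feature_dim)
        self.fc = nn.Linear(feature_dim, 1000)

    def forward(self, x):
        return self.fc(torch.relu(self.features(x)))


number_classes = 2
model = TinyBackbone()
# Replace the head with dropout followed by a new linear layer.
model.fc = nn.Sequential(
    nn.Dropout(p=0.5),
    nn.Linear(2048, number_classes),
)

x = torch.randn(4, 32)

model.train()   # dropout active: repeated forward passes generally differ
train_out = model(x)

model.eval()    # dropout disabled: forward passes are deterministic
with torch.no_grad():
    eval_a = model(x)
    eval_b = model(x)

print(train_out.shape, torch.equal(eval_a, eval_b))
```

Apart from replacing the head and switching modes, no other dropout-specific code should be needed in the training loop.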

Okay, thanks! Let me try.