On dropout-enhanced CNN training code

Hi all,

I just added dropout to my basic CNN training code to avoid overfitting, as follows:

class DropoutFC(nn.Module):
    def __init__(self):
        super(DropoutFC, self).__init__()
        self.fc = nn.Linear(100, 20)
        self.dropout = nn.Dropout(p=0.5)

    def forward(self, input):
        out = self.fc(input)
        out = self.dropout(out)
        return out

Net = DropoutFC()
Net.train()

My question is therefore: besides the code above, does additional code need to be added anywhere else in the original Python file?

By the way, the code that uses the Train class is also shown here:
test_train = Train()
test_train.start_train(epoch=28,batch_display=2000,batch_size=50,save_freq=1)

I’m not sure I understand your question.
You have created this module to combine a linear and Dropout layer, both with fixed parameters.

It looks like your model just consists of this DropoutFC layer. Usually you put Dropout between two consecutive linear layers. Currently you would drop some of your outputs.
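For example, something like this (just a sketch; the layer sizes and the ReLU are placeholders, not taken from your model):

import torch.nn as nn
import torch.nn.functional as F

class TwoLayerFC(nn.Module):
    def __init__(self):
        super(TwoLayerFC, self).__init__()
        self.fc1 = nn.Linear(100, 50)
        self.dropout = nn.Dropout(p=0.5)  # dropout sits between the two linear layers
        self.fc2 = nn.Linear(50, 20)

    def forward(self, x):
        x = F.relu(self.fc1(x))
        x = self.dropout(x)  # zeroes hidden activations, not the final predictions
        return self.fc2(x)

This way dropout regularizes the hidden activations instead of randomly zeroing your model's outputs.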

What do you mean by “original Python file”?
Where would you like to put this code?

What class is Train()? It looks like a high-level wrapper.

Hi ptrblck,

Thanks for your reply. By ‘original Python file’ I mean the training code without the dropout technique. Below is the enhanced version; the code at the beginning, including several import statements and the definition of one performance metric, has been omitted.

class DropoutFC(nn.Module):
    def __init__(self):
        super(DropoutFC, self).__init__()
        self.fc = nn.Linear(100, 20)
        self.dropout = nn.Dropout(p=0.5)

    def forward(self, input):
        out = self.fc(input)
        out = self.dropout(out)
        return out

Net = DropoutFC()
Net.train()

class Train:
    def __init__(self, root_path = "/home/simon/Downloads/TrainFramesSS/", model_name = "vgg16", number_classes = 2, path_prefix="VGG16", loadPretrain=0):
        """
        Init Dataset, Model and others
        """
        self.save_prefix = path_prefix
        self.num_classes = number_classes
        self.pretrained = None
        self.aff_dataset = ImageData(root_path=root_path, train_mode = "train")
        if model_name == "resnet50":
            self.model = resnet50(pretrained=(loadPretrain==1), num_classes = number_classes, model_path = self.pretrained )
        elif model_name == "vgg16":
            self.model = vgg16(pretrained=(loadPretrain==1), num_classes = number_classes, model_path = self.pretrained)

        if torch.cuda.device_count() > 1:
            print("There are ", torch.cuda.device_count(), "GPUs!")
            self.model = nn.DataParallel(self.model)
            
        elif torch.cuda.device_count() == 1:
            print("There is only one GPU")
            
        else:
            print("Only use CPU")
        
        
        
        if torch.cuda.is_available():
           self.model.cuda()

    def start_train(self, epoch=10, batch_size=50, learning_rate=0.001, batch_display=1000, save_freq=1):
        """
        Detail of training
        """
        self.epoch_num = epoch
        self.batch_size = batch_size
        self.lr = learning_rate
        
        #loss_function = nn.CrossEntropyLoss().cuda()
        #loss_function = nn.MSELoss().cuda()
        optimizer = optim.SGD(self.model.parameters(), lr=self.lr)

        for epoch in range(self.epoch_num):
            epoch_count = 0
            total_loss = 0
            dataloader = DataLoader(self.aff_dataset, batch_size=self.batch_size, shuffle=True, num_workers=8)
            
            for i_batch, sample_batch in enumerate(dataloader):
 
                # Step.1 Load data and label
                images_batch, labels_batch = sample_batch['image'], sample_batch['label']
                """
                for i in range(images_batch.shape[0]):
                    img_tmp = transforms.ToPILImage()(images_batch[i]).convert('RGB')
                    plt.imshow(img_tmp)
                    plt.pause(0.001)
                """                   
                labels_batch = torch.FloatTensor(labels_batch.view(-1,self.num_classes).numpy())
                if torch.cuda.is_available():
                    input_image = autograd.Variable(images_batch.cuda())
                    gtlabel = autograd.Variable(labels_batch.cuda(non_blocking=True))
                else:
                    input_image = autograd.Variable(images_batch)
                    gtlabel = autograd.Variable(labels_batch)
                # Step.2 calculate loss
                output = self.model(input_image)
                #loss = loss_function(output, gtlabel)
                loss = calLoss(output, gtlabel)
                epoch_count += 1
                total_loss += loss
                # Step.3 Update
                optimizer.zero_grad()
                loss.backward()
                optimizer.step()
                # Check Result
                if i_batch % batch_display == 0:
                    print("Epoch : %d Batch : %d, Loss : %f, " %(epoch, i_batch, loss))
                '''
                if i_batch % batch_display == 0:
                    pred_prob, pred_label = torch.max(output, dim=1)
                    print("Input Label : ", gtlabel[:4])
                    print("Output Label : ", pred_label[:4])
                    batch_correct = (pred_label == gtlabel).sum().data[0] * 1.0 / self.batch_size
                    print("Epoch : %d, Batch : %d, Loss : %f, Batch Accuracy %f" %(epoch, i_batch, loss, batch_correct))
                '''
            """
            Save model
            """
            print("Epoch %d Average Loss : %f" %(epoch, total_loss * self.batch_size / epoch_count))
            if epoch % save_freq == 0:
                torch.save(self.model.state_dict(), self.save_prefix+'_M-CCC_'+'SS_'+'%04d.pkl'%epoch)

test_train = Train() 
test_train.start_train(epoch=28,batch_display=2000,batch_size=50,save_freq=1)

Would you like to add DropoutFC somewhere into Train?
Maybe I’ve missed it, but it seems it is not being used.
Is your model overfitting and you would like to regularize it with your module?

As a side note, it looks like your Train class might store the computation graph in every iteration, leading to increasing memory usage which might finally yield an out-of-memory error.
If you need to store the loss values for debugging, you should just store the Python float value instead of the tensor. Use this line instead:

total_loss += loss.item()
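In your training loop only that accumulation line changes; everything else stays the same:

                epoch_count += 1
                total_loss += loss.item()  # .item() returns a plain Python float, so no graph is kept alive

total_loss is then an ordinary float, and the average-loss print at the end of the epoch works unchanged.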

PS: I’ve formatted your post, since the code was not readable. You can add code blocks using three backticks.

Well. THANKS!

Yes, there is an overfitting problem in the original version, so I have to use the dropout code.

How do I add DropoutFC into Train?

Many thanks for your technical note.

Sorry, ptrblck. I didn’t quite catch what you meant just now.

‘Would you like to add DropoutFC somewhere into Train?’ - what did you mean?

I have added DropoutFC at the top, as shown above.

Did you mean I should also add this kind of code somewhere else in Train?

In the code you’ve posted you just created an instance of DropoutFC without using it.
I’m not sure how you are using the module.

If you want to add a dropout and linear layer to your pre-trained model, you could use:

model = models.resnet50(pretrained=True)
model.fc = nn.Sequential(
    nn.Dropout(p=0.5),
    nn.Linear(2048, number_classes)
)
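
Applied to your Train class, that could go into __init__ right after the model is created. I’m assuming torchvision-style attribute names here (resnet50 exposes .fc, vgg16 exposes .classifier); check what your own resnet50/vgg16 wrappers actually expose:

        # Sketch, assuming torchvision-style models (verify the attribute
        # names against your own model implementations):
        if model_name == "resnet50":
            self.model.fc = nn.Sequential(
                nn.Dropout(p=0.5),
                nn.Linear(2048, number_classes)
            )
        elif model_name == "vgg16":
            self.model.classifier[6] = nn.Sequential(
                nn.Dropout(p=0.5),
                nn.Linear(4096, number_classes)
            )

Also note that dropout is only active in training mode, i.e. after calling model.train(); call model.eval() for validation or testing so it is disabled.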

Okay, thanks! Let me try.