Hello everyone, I am using a UNet for image segmentation.
At the beginning I used the most basic training strategy, and the results were acceptable. The code of the train function is below.
```python
def train_model(model, optimizer, loss_fn, dataloader, args):
    model.train()
    for epoch in range(args.num_epochs):
        print('Epoch {}/{}'.format(epoch, args.num_epochs - 1))
        print('-' * 10)
        epoch_loss = 0
        step = 0
        acc = 0
        for i, (train_batch, labels_train) in enumerate(dataloader):
            step += 1
            train_batch, labels_train = Variable(train_batch), Variable(labels_train)
            labels_train = labels_train.float()
            optimizer.zero_grad()
            output_batch = model(train_batch)
            # flatten predictions and targets before the loss
            output_flat = output_batch.view(-1)
            true_flat = labels_train.view(-1)
            loss = loss_fn(output_flat, true_flat)
            loss.backward()
            optimizer.step()
            epoch_loss += loss.item()
        epoch += 1  # 1-indexed name for the checkpoint
        makeDirectory(path, args.num_epochs)
        savepath = path + '/' + savename + '_' + str(epoch) + '/' + savename + '_' + str(epoch) + '.pth'
        torch.save(model.state_dict(), savepath)
    return model
```
Because I want to improve the results and select the model in a more principled way, I added some code so that the checkpoint is chosen according to the minimum loss.
```python
def train_model(model, optimizer, loss_fn, dataloader, args):
    is_better = True
    min_loss = float('inf')
    model.train()
    for epoch in range(args.num_epochs):
        print('Epoch {}/{}'.format(epoch, args.num_epochs - 1))
        print('-' * 10)
        epoch_loss = 0
        step = 0
        acc = 0
        for i, (train_batch, labels_train) in enumerate(dataloader):
            step += 1
            train_batch, labels_train = Variable(train_batch), Variable(labels_train)
            labels_train = labels_train.float()
            optimizer.zero_grad()
            output_batch = model(train_batch)
            output_flat = output_batch.view(-1)
            true_flat = labels_train.view(-1)
            loss = loss_fn(output_flat, true_flat)
            loss.backward()
            optimizer.step()
            epoch_loss += loss.item()
        # keep the state that should correspond to the minimum loss
        if is_better:
            min_loss = epoch_loss
            best_model = model.state_dict()
        epoch += 1
        makeDirectory(path, args.num_epochs)
        savepath = path + '/' + savename + '_' + str(epoch) + '/' + savename + '_' + str(epoch) + '.pth'
        torch.save(best_model, savepath)
    return model
```
But the strange thing is, the epoch_loss now keeps jumping around on a large scale. When I used the original code, there were some fluctuations in the loss, but the overall trend was downward, and the fluctuations only appeared after more than 100 epochs. I don't know why it does not work this way.
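For reference, the selection behaviour I was trying to get is roughly this. It is a standalone sketch with dummy loss values, not my actual training loop, and `update_best` is just a name I made up here:

```python
import copy

def update_best(epoch_loss, min_loss, model_state, best_state):
    """Keep whichever state has the lowest epoch loss seen so far."""
    if epoch_loss < min_loss:  # compare against the running minimum
        # deepcopy so later training steps do not mutate the saved state
        return epoch_loss, copy.deepcopy(model_state)
    return min_loss, best_state

# dummy run over fake per-epoch losses
min_loss, best = float('inf'), None
for ep, loss in enumerate([0.9, 0.5, 0.7, 0.3]):
    min_loss, best = update_best(loss, min_loss, {'epoch': ep}, best)
print(min_loss, best)  # 0.3 {'epoch': 3}
```

That is: only overwrite the best state when the new epoch loss is actually lower than the minimum so far.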
Also: if I use a validation set with a very small number of pictures, even a single image, and use the validation loss/accuracy to select the model parameters, is that more effective than using only the training set? And if the validation function is called inside the train function, should I add `with torch.no_grad()` in the validation function?
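Concretely, this is the kind of validation function I mean (`validate` and `val_loader` are just placeholder names, not from my code above):

```python
import torch

def validate(model, loss_fn, val_loader):
    model.eval()  # switch off dropout / batchnorm updates
    total_loss = 0.0
    with torch.no_grad():  # this is the part I am unsure about
        for images, labels in val_loader:
            preds = model(images)
            total_loss += loss_fn(preds.view(-1), labels.float().view(-1)).item()
    model.train()  # put the model back in training mode for the caller
    return total_loss / max(len(val_loader), 1)
```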
Thanks in advance