Is this a problem with the model, or is something wrong in my training process or the way I am giving input to the model?

I have used an FCN-8s model with a VGG backbone for image segmentation on the Pascal VOC 2012 dataset, but after training the model I am unable to get the desired segmentation results. I have used this link

for building the FCN model, with the dataset that is available in torchvision.datasets.

These are some screenshots of my project. I couldn’t figure out where the problem is: the model reaches a minimal loss, but the segmentation does not work as I hoped.

What is not working at the moment, i.e. are the outputs not as expected?
Did the model work fine during training and validation (were the outputs reasonable)?

Using U-Net and FCN (VGG) models, I am unable to get the desired accuracy on the Pascal VOC 2012 segmentation dataset: with 10 epochs I am only getting accuracy from 27% to 45%. I have used the dataset available in torchvision as

train_set = torchvision.datasets.VOCSegmentation(root='drive/My Drive/VOC',

and a batch size of 64:

train_loader = data.DataLoader(train_set, batch_size=64, shuffle=True)

and my training loop is:

train_Accuracy = []
train_Loss = []
dictionary = {}
for epoch in range(8):
  total_loss = 0
  total_correct = 0
  total_train = 0
  correct_train = 0
  ts = time.time()

  for batch in train_loader:
    images, labels = batch


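    # Assumption: an `optimizer` is defined elsewhere; the update steps are not shown in the post.
    optimizer.zero_grad()
    preds = model(images)
    loss = criterion(preds, labels)
    loss.backward()
    optimizer.step()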


    total_loss += loss.item()
    # total_correct += get_num_correct(preds, labels)

    _, predicted = torch.max(preds, 1)  # predicted class index per pixel
    total_train += labels.nelement()
    correct_train += predicted.eq(labels).sum().item()
    train_accuracy = 100 * correct_train / total_train

  train_Loss.append(round(loss.item(), 5))
  train_Accuracy.append(round(train_accuracy, 4))

  dictionary = {
      "epoch": epoch + 1,
      "loss": round(loss.item(), 5),
      "accuracy": round(train_accuracy, 4),
      "training time": time.time() - ts,
      "model": model.state_dict()
  }
  # Save one checkpoint per epoch
  torch.save(dictionary, 'data_to_be_saved_{}.pth'.format(epoch + 1))
  print('epoch: ', epoch + 1,
        # ' total_correct: ', total_correct,
        # ' loss: ', total_loss/len(data_loader),
        ' loss: ', round(loss.item(), 5),
        ' accuracy: ', round(train_accuracy, 4),
        ' training time: ', time.time() - ts)
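
For reference, the per-pixel accuracy that the loop above accumulates can be sketched as a standalone function. This is only a minimal sketch: the name `pixel_accuracy` and the `ignore_index=255` default are assumptions for illustration (Pascal VOC masks mark void/boundary pixels with 255, and counting them can distort the accuracy), not something taken from the code above.

```python
import torch


def pixel_accuracy(logits, target, ignore_index=255):
    """Return (num_correct, num_valid) for a batch of segmentation logits.

    logits: float tensor of shape (N, C, H, W)
    target: long tensor of shape (N, H, W) holding class indices
    ignore_index: label value excluded from the count (assumed 255 here)
    """
    predicted = logits.argmax(dim=1)       # (N, H, W) class index per pixel
    valid = target != ignore_index         # mask out ignored pixels
    correct = (predicted.eq(target) & valid).sum().item()
    return correct, valid.sum().item()


# Toy batch: 1 image, 3 classes, 2x2 pixels; class 1 wins everywhere.
logits = torch.zeros(1, 3, 2, 2)
logits[0, 1] = 1.0
target = torch.tensor([[[1, 1], [0, 255]]])

correct, valid = pixel_accuracy(logits, target)
print(f"accuracy: {100 * correct / valid:.2f}%")
```

Accumulating `correct` and `valid` over all batches and dividing once at the end of the epoch gives the epoch accuracy.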

I would recommend trying to overfit a small data sample and making sure your model can successfully predict these samples.
If that’s not the case, there might be a bug in the code I’m missing or the architecture is not suitable for the task.
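
A minimal sketch of such an overfitting check, assuming a tiny stand-in network and synthetic data (the model, tensor shapes, learning rate, and step count below are all hypothetical; substitute your own model and a handful of real samples):

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Hypothetical stand-in for the real segmentation network.
model = nn.Sequential(
    nn.Conv2d(3, 16, kernel_size=3, padding=1),
    nn.ReLU(),
    nn.Conv2d(16, 2, kernel_size=1),
)

# One small, fixed batch: 4 images with a learnable per-pixel target.
images = torch.rand(4, 3, 16, 16)
labels = (images[:, 0] > 0.5).long()    # (4, 16, 16), classes 0 and 1

criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-2)

losses = []
model.train()
for step in range(100):
    optimizer.zero_grad()
    loss = criterion(model(images), labels)
    loss.backward()
    optimizer.step()
    losses.append(loss.item())

print(f"first loss: {losses[0]:.4f}, last loss: {losses[-1]:.4f}")
```

If the loss on the same few samples does not drop clearly, the data pipeline, the loss setup, or the model is the more likely culprit than the amount of training.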

Thanks, Sir. The problem was in my dataset and I fixed it; with this the accuracy is quite good, but using that model to segment a single image is just not working.

trans_img_tensor = T.Compose([
    T.Resize(256),
    T.CenterCrop(224),
    T.ToTensor()
])
image = Image.open('rose.jpeg')
a = trans_img_tensor(image)
c = a.unsqueeze(0)
label = model(c)
trans_tensor_2_img = T.Compose([
    T.ToPILImage()
])
d = trans_tensor_2_img(label.argmax(1).float())
d

Your code is a bit hard to read, but did you call model.eval() before trying to test the model?
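
For illustration, single-image inference in evaluation mode could look roughly like the sketch below. The network here is a hypothetical stand-in (21 output channels are assumed for the 20 VOC classes plus background), and the random tensor takes the place of a transformed input image.

```python
import torch
import torch.nn as nn

# Hypothetical stand-in for the trained segmentation model.
model = nn.Sequential(
    nn.Conv2d(3, 8, kernel_size=3, padding=1),
    nn.BatchNorm2d(8),
    nn.ReLU(),
    nn.Dropout2d(0.5),
    nn.Conv2d(8, 21, kernel_size=1),
)

image = torch.rand(1, 3, 224, 224)   # placeholder for a transformed image batch

model.eval()                         # use running batchnorm stats, disable dropout
with torch.no_grad():                # no gradients are needed for inference
    logits = model(image)            # (1, 21, 224, 224)

mask = logits.argmax(dim=1)[0]       # (224, 224) class index per pixel

# Spread the class indices 0..20 over 0..240 so the mask is visible as a grayscale image.
mask_image = (mask * (255 // 20)).to(torch.uint8)
```

Keeping the mask as integer class indices and scaling it explicitly is one possible choice for visualization; a color palette per class would work as well.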

Sir, I am trying to load a saved model and use it on the image. I have used


to save the model and using

model = TheModelClass(*args, **kwargs)

to load the model, but *args and **kwargs are not defined. What should I pass for them? I have read your previous answer on this but wasn’t quite able to work out what to use.
The model used above is a CNN, and I am trying to load a U-Net.

import torch.nn as nn
import torch.nn.functional as F

class TheModelClass(nn.Module):
    def __init__(self):
        super(TheModelClass, self).__init__()
        self.conv1 = nn.Conv2d(3, 6, 5)
        self.pool = nn.MaxPool2d(2, 2)
        self.conv2 = nn.Conv2d(6, 16, 5)
        self.fc1 = nn.Linear(16 * 5 * 5, 120)
        self.fc2 = nn.Linear(120, 84)
        self.fc3 = nn.Linear(84, 10)

    def forward(self, x):
        x = self.pool(F.relu(self.conv1(x)))
        x = self.pool(F.relu(self.conv2(x)))
        x = x.view(-1, 16 * 5 * 5)
        x = F.relu(self.fc1(x))
        x = F.relu(self.fc2(x))
        x = self.fc3(x)
        return x

# Initialize model
model = TheModelClass()

The args and kwargs are just placeholders in case your model’s __init__ takes some arguments.
Since yours only uses self, you don’t need to pass anything to the initialization.
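
As a rough sketch of the case where `__init__` does take arguments: the model has to be re-created with the same arguments that were used before saving, and then the state_dict is loaded into it. `TinySegNet`, its parameters, and the file name are hypothetical; the `"model"` key mirrors the checkpoint dictionary from the training loop earlier in the thread.

```python
import os
import tempfile

import torch
import torch.nn as nn


class TinySegNet(nn.Module):
    """Hypothetical model whose constructor takes arguments."""

    def __init__(self, in_channels, num_classes):
        super().__init__()
        self.conv = nn.Conv2d(in_channels, num_classes, kernel_size=1)

    def forward(self, x):
        return self.conv(x)


# Save a checkpoint dictionary containing the state_dict.
model = TinySegNet(3, num_classes=21)
path = os.path.join(tempfile.mkdtemp(), "checkpoint.pth")
torch.save({"epoch": 1, "model": model.state_dict()}, path)

# Load: build the model with the same constructor arguments, then restore the weights.
restored = TinySegNet(3, num_classes=21)
checkpoint = torch.load(path, map_location="cpu")
restored.load_state_dict(checkpoint["model"])
restored.eval()
```

If the constructor arguments differ from those used at save time, `load_state_dict` will typically report shape or key mismatches.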

I was successful in loading the model with your guidance, but using this

def image_segmentation(path):
  image = Image.open(path)
  trans_image = trans(image)
  dim_added_image = trans_image.unsqueeze(0)
  segment = model(dim_added_image)
  segmented_image_tensor = segment.argmax(1).float()
  segmented_image = trans_2_img(segmented_image_tensor)
  return segmented_image

I am still not able to segment the image that I feed to it.