Problem during testing

I'm trying to make a submission to the Kaggle segmentation challenge and I'm facing a problem that I'm not sure how to approach. When I run the test network with a batch size of two, I get a mask of the image, and np.unique() on it returns [0 1] (image attached). But when I run it with a batch size of one, np.unique() returns only 0s. I'm not sure why this is happening!

Any suggestions on how to fix this will be helpful!


I suppose you are predicting the mask using a softmax function?
I think you should pool in the channel dimension, not the batch dimension.

Assuming your output has the shape [batch, channels, h, w], you are calculating _, preds = torch.max(output, 0, keepdim=True), which takes the maximum across the batch dimension. If you only provide one image (batch=1), then of course index 0 always holds the max value.

Try using torch.max(output, 1, keepdim=True).
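To illustrate the difference between the two dimensions, here is a minimal sketch. The tensor values are made up, and a two-channel output is assumed purely for demonstration; it is not taken from the model in this thread.

```python
import torch

# Hypothetical output of shape [batch=1, channels=2, h=2, w=2]; values are made up.
output = torch.tensor([[[[0.9, 0.2],
                         [0.4, 0.7]],
                        [[0.1, 0.8],
                         [0.6, 0.3]]]])

# Reducing over dim 0 compares the images in the batch with each other.
# With a single image in the batch, the winning index is always 0.
_, preds_batch = torch.max(output, 0, keepdim=True)

# Reducing over dim 1 compares the channels at each pixel,
# which gives a class index per pixel.
_, preds_channel = torch.max(output, 1, keepdim=True)

print(preds_batch.unique())      # contains only 0
print(preds_channel.squeeze())   # per-pixel class indices
```

Note that this only makes sense when the model returns more than one channel.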

Hello ptrblck, thank you for your time.

I tried torch.max(output, 1, keepdim=True). After converting the result to numpy, np.unique() returns only [0]. Surprisingly, when I set the batch size to 2 and print output.size(), I get [1, 1, 128, 128]; I was expecting [2, 1, 128, 128]. Do you have any suggestions on this?


That’s strange. Could you please also print data.shape before passing it into the model?
If your output (without the max operation) has shape [1, 1, 128, 128], something seems to be wrong with your model. Could you post the model definition, if possible?

BTW, probably unrelated, but you should call just the model with your input: model(data) instead of model.forward(data).
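One reason to prefer calling the module directly is that registered hooks only run through the module call, not through forward(). A small sketch, using a stand-in nn.Conv2d layer and a hypothetical hook rather than the model from this thread:

```python
import torch
import torch.nn as nn

# Stand-in module; any nn.Module behaves the same way here.
model = nn.Conv2d(3, 1, kernel_size=1)

calls = []
# The forward hook records the output shape each time it fires.
handle = model.register_forward_hook(
    lambda module, inputs, output: calls.append(tuple(output.shape))
)

data = torch.randn(1, 3, 8, 8)

model.forward(data)                # bypasses the hook machinery
hooks_after_forward = len(calls)

model(data)                        # goes through __call__, so the hook fires
hooks_after_call = len(calls)

handle.remove()
print(hooks_after_forward, hooks_after_call)
```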

Thanks for the reply,

Let me try this

I printed data.size() and it comes out as [1, 3, 128, 128].
I also printed the training input (x_train), which comes out as [1, 3, 128, 128] as well.

The model definition is:

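import torch as t
import torch.nn as nn
import torch.nn.functional as F

class double_conv(nn.Module):
    '''(conv => BN => ReLU) * 2'''
    def __init__(self, in_ch, out_ch):
        super(double_conv, self).__init__()
        self.conv = nn.Sequential(
            nn.Conv2d(in_ch, out_ch, 3, padding=1),
            nn.BatchNorm2d(out_ch),
            nn.ReLU(inplace=True),
            nn.Conv2d(out_ch, out_ch, 3, padding=1),
            nn.BatchNorm2d(out_ch),
            nn.ReLU(inplace=True),
        )

    def forward(self, x):
        x = self.conv(x)
        return x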

class inconv(nn.Module):
    def __init__(self, in_ch, out_ch):
        super(inconv, self).__init__()
        self.conv = double_conv(in_ch, out_ch)

    def forward(self, x):
        x = self.conv(x)
        return x

class down(nn.Module):
    def __init__(self, in_ch, out_ch):
        super(down, self).__init__()
        self.mpconv = nn.Sequential(
            nn.MaxPool2d(2),
            double_conv(in_ch, out_ch),
        )

    def forward(self, x):
        x = self.mpconv(x)
        return x

class up(nn.Module):
    def __init__(self, in_ch, out_ch, bilinear=True):
        super(up, self).__init__()

        if bilinear:
            self.up = nn.Upsample(scale_factor=2)
        else:
            self.up = nn.ConvTranspose2d(in_ch, out_ch, 2, stride=2)

        self.conv = double_conv(in_ch, out_ch)

    def forward(self, x1, x2):
        x1 = self.up(x1)
        diffX = x1.size()[2] - x2.size()[2]
        diffY = x1.size()[3] - x2.size()[3]
        x2 = F.pad(x2, (diffX // 2, int(diffX / 2),
                        diffY // 2, int(diffY / 2)))
        x = t.cat([x2, x1], dim=1)
        x = self.conv(x)
        return x

class outconv(nn.Module):
    def __init__(self, in_ch, out_ch):
        super(outconv, self).__init__()
        self.conv = nn.Conv2d(in_ch, out_ch, 1)

    def forward(self, x):
        x = self.conv(x)
        return x

class UNet(nn.Module):
    def __init__(self, n_channels, n_classes):
        super(UNet, self).__init__()
        self.inc = inconv(n_channels, 64)
        self.down1 = down(64, 128)
        self.down2 = down(128, 256)
        self.down3 = down(256, 512)
        self.down4 = down(512, 512)
        self.up1 = up(1024, 256)
        self.up2 = up(512, 128)
        self.up3 = up(256, 64)
        self.up4 = up(128, 64)
        self.outc = outconv(64, n_classes)

    def forward(self, x):
        x1 = self.inc(x)
        x2 = self.down1(x1)
        x3 = self.down2(x2)
        x4 = self.down3(x3)
        x5 = self.down4(x4)
        x = self.up1(x5, x4)
        x = self.up2(x, x3)
        x = self.up3(x, x2)
        x = self.up4(x, x1)
        x = self.outc(x)
        x = t.nn.functional.sigmoid(x)
        #x = t.nn.functional.softmax(x)
        return x

def soft_dice_loss(inputs, targets):
        num = targets.size(0)
        m1  = inputs.view(num,-1)
        m2  = targets.view(num,-1)
        intersection = (m1 * m2)
        score = 2. * (intersection.sum(1)+1) / (m1.sum(1) + m2.sum(1)+1)
        score = 1 - score.sum()/num
        return score

model = UNet(3,1).cuda()
optimizer = t.optim.Adam(model.parameters(),lr = 1e-3)
#configure("runs/run-1", flush_secs=2)
for epoch in range(1):
    for x_train, y_train  in tqdm(dataloader):
        x_train = t.autograd.Variable(x_train).cuda()
        y_train = t.autograd.Variable(y_train).cuda()
        o = model.forward(x_train)
        loss = soft_dice_loss(o, y_train)
        log_value('TrainLoss', loss, epoch)

The testing part is:

model = model.eval()
predictions = []
for data in testdataloader:
    data = t.autograd.Variable(data, volatile=True).cuda()
    output = model(data)
    first =
    #print (np.mean(first))
    _, preds = t.max(output, 0, keepdim=True)

Let me know if you have any suggestions, thanks.

It seems your model returns just 1 channel using the sigmoid function.
You shouldn’t apply torch.max() to this, but just a threshold.
torch.max is useful when your model returns predictions in multiple channels, with each channel giving the “probability” of one class. In your case you have a binary problem, i.e. 1 channel, where prediction > threshold indicates class 1 and prediction < threshold indicates class 0.
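A minimal sketch of the thresholding, assuming a cut-off of 0.5 and made-up sigmoid outputs (both are assumptions for illustration; the best threshold may differ for your data):

```python
import torch

# Made-up sigmoid output of shape [batch=2, channels=1, h=2, w=2].
output = torch.tensor([[[[0.9, 0.2],
                         [0.4, 0.7]]],
                       [[[0.1, 0.6],
                         [0.8, 0.3]]]])

threshold = 0.5  # assumed cut-off, not a value from this thread

# Pixels above the threshold become class 1, all others class 0.
preds = (output > threshold).long()

print(preds.shape)     # same shape as the model output
print(preds.unique())  # the class indices present in the mask
```

Because the comparison is element-wise, this works for any batch size, including a batch of one.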

Could you check again, that your DataLoader returns the same number of images as set by batch_size?
I.e.: for batch_size=2 your data or x_train should have the shape [2, ...].
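A quick way to check this is to iterate over the loader and collect the batch shapes. The sketch below uses a random stand-in dataset wrapped in a TensorDataset, which is an assumption; your own Dataset may return the images in a different structure.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# Stand-in dataset of 5 random "images"; replace with your own Dataset.
images = torch.randn(5, 3, 128, 128)
loader = DataLoader(TensorDataset(images), batch_size=2)

# TensorDataset yields tuples, so each batch is a list with one tensor.
shapes = [tuple(batch[0].shape) for batch in loader]

for shape in shapes:
    print(shape)
```

With 5 samples and batch_size=2, the last batch holds only one image unless drop_last=True is passed to the DataLoader.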

Thanks for the clarity on torch.max(); I will try to get the thresholding working.

Regarding the dataloader, you are right: when I set the batch size to 2, it returns [2, 3, 128, 128].