Inconsistent tensor sizes at the last layer of SegNet

Hi @ptrblck , thanks a lot for your help.

I changed the code as you mentioned, but it seems to be consuming a lot of memory (I do not have a GPU).

Here is my modified code:

for epoch in range(1, num_epochs+1):
    epoch_loss = []
    iteration=1
    for step, (images, labels) in enumerate(trainLoader):
        print("Iter:"+str(iteration))
        im_arr = np.array(labels)
        #print("Labels ==> ",np.unique(np.array(labels)))
        im_arr[(im_arr!=0) & (im_arr!=15) & (im_arr!=40)] = 0
        
        # Set to classes
        im_arr[im_arr==15] = 1
        im_arr[im_arr==40] = 2
        #print(np.unique(im_arr))
        
        iteration=iteration+1
        inputs = Variable(images)
        targets = Variable(torch.from_numpy(im_arr))
        
        outputs = model(inputs)
        optimizer.zero_grad()
        #print("outputs size ==> ",outputs.size())
        
        #print("outputs[:,0] size ==> ",outputs[:,0].size())
        #print("targets[:, 0] size ==> ",targets[:, 0].size())


        #loss = criterion(outputs, targets[:, 0])
        
        #print(targets[:,0])
        loss = criterion(outputs,targets[:,0])
        
        loss.backward()
        optimizer.step()
        epoch_loss.append(loss.data[0])
        
        average = sum(epoch_loss) / len(epoch_loss)
        
        print("loss: "+str(average)+" epoch: "+str(epoch)+", step: "+str(step))

Here is the out-of-memory error:


Iter:1
/opt/anaconda/lib/python3.6/site-packages/torch/nn/modules/upsampling.py:180: UserWarning: nn.UpsamplingBilinear2d is deprecated. Use nn.Upsample instead.
  warnings.warn("nn.UpsamplingBilinear2d is deprecated. Use nn.Upsample instead.")
/opt/anaconda/lib/python3.6/site-packages/torch/nn/functional.py:1423: UserWarning: nn.functional.upsample_bilinear is deprecated. Use nn.functional.upsample instead.
  warnings.warn("nn.functional.upsample_bilinear is deprecated. Use nn.functional.upsample instead.")
/opt/anaconda/lib/python3.6/site-packages/ipykernel_launcher.py:13: UserWarning: Implicit dimension choice for log_softmax has been deprecated. Change the call to include dim=X as an argument.
  del sys.path[0]
loss: 1.3778443336486816 epoch: 1, step: 0
Iter:2
---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
<ipython-input-25-346a5ab069f5> in <module>()
     17         targets = Variable(torch.from_numpy(im_arr))
     18 
---> 19         outputs = model(inputs)
     20         optimizer.zero_grad()
     21         #print("outputs size ==> ",outputs.size())

/opt/anaconda/lib/python3.6/site-packages/torch/nn/modules/module.py in __call__(self, *input, **kwargs)
    323         for hook in self._forward_pre_hooks.values():
    324             hook(self, input)
--> 325         result = self.forward(*input, **kwargs)
    326         for hook in self._forward_hooks.values():
    327             hook_result = hook(self, input, result)

<ipython-input-5-72a9e198fd06> in forward(self, x)
     54         #print('layer2', x.size())
     55 
---> 56         x = self.layer3(x)
     57         #print('layer3', x.size())
     58 

/opt/anaconda/lib/python3.6/site-packages/torch/nn/modules/module.py in __call__(self, *input, **kwargs)
    323         for hook in self._forward_pre_hooks.values():
    324             hook(self, input)
--> 325         result = self.forward(*input, **kwargs)
    326         for hook in self._forward_hooks.values():
    327             hook_result = hook(self, input, result)

/opt/anaconda/lib/python3.6/site-packages/torch/nn/modules/container.py in forward(self, input)
     65     def forward(self, input):
     66         for module in self._modules.values():
---> 67             input = module(input)
     68         return input
     69 

/opt/anaconda/lib/python3.6/site-packages/torch/nn/modules/module.py in __call__(self, *input, **kwargs)
    323         for hook in self._forward_pre_hooks.values():
    324             hook(self, input)
--> 325         result = self.forward(*input, **kwargs)
    326         for hook in self._forward_hooks.values():
    327             hook_result = hook(self, input, result)

/opt/anaconda/lib/python3.6/site-packages/torchvision/models/resnet.py in forward(self, x)
     74         residual = x
     75 
---> 76         out = self.conv1(x)
     77         out = self.bn1(out)
     78         out = self.relu(out)

/opt/anaconda/lib/python3.6/site-packages/torch/nn/modules/module.py in __call__(self, *input, **kwargs)
    323         for hook in self._forward_pre_hooks.values():
    324             hook(self, input)
--> 325         result = self.forward(*input, **kwargs)
    326         for hook in self._forward_hooks.values():
    327             hook_result = hook(self, input, result)

/opt/anaconda/lib/python3.6/site-packages/torch/nn/modules/conv.py in forward(self, input)
    275     def forward(self, input):
    276         return F.conv2d(input, self.weight, self.bias, self.stride,
--> 277                         self.padding, self.dilation, self.groups)
    278 
    279 

/opt/anaconda/lib/python3.6/site-packages/torch/nn/functional.py in conv2d(input, weight, bias, stride, padding, dilation, groups)
     88                 _pair(0), groups, torch.backends.cudnn.benchmark,
     89                 torch.backends.cudnn.deterministic, torch.backends.cudnn.enabled)
---> 90     return f(input, weight, bias)
     91 
     92 

RuntimeError: $ Torch: not enough memory: you tried to allocate 0GB. Buy new RAM! at /opt/conda/conda-bld/pytorch_1513368888240/work/torch/lib/TH/THGeneral.c:246

Am I doing anything wrong here, or is it just a memory issue?
I am planning to move the code to a server with a GPU. Will that solve this issue?

How much RAM does your machine have? I doubt the GPU will have more.
How large is the batch size? Could you lower it?
Also, you are using a lot of channels in your conv layers. Try to halve them just to see if it’s working.
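
Just as an illustration (a rough sketch with made-up channel sizes, not your actual layers), halving the channels would mean something like:

```python
import torch.nn as nn

# Hypothetical block with the original channel counts
block = nn.Sequential(
    nn.Conv2d(1024, 512, kernel_size=3, padding=1),
    nn.ReLU(inplace=True),
    nn.Conv2d(512, 512, kernel_size=3, padding=1),
    nn.ReLU(inplace=True),
)

# Same block with halved output channels, just to compare the memory usage
block_half = nn.Sequential(
    nn.Conv2d(1024, 256, kernel_size=3, padding=1),
    nn.ReLU(inplace=True),
    nn.Conv2d(256, 256, kernel_size=3, padding=1),
    nn.ReLU(inplace=True),
)
```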


Hi @ptrblck , thanks for your reply.

This is my memory information:


free
              total        used        free      shared  buff/cache   available
Mem:       65856408     4487936    58792948        4880     2575524    60738088
Swap:      31249404       77076    31172328

Also, I am using batch_size=1.

Regarding the number of channels: I am building PSPNet on top of Resnet_101, so I need to match the exact number of channels that ResNet produces, right?

Could you please tell me where I can reduce the number of channels in my code?

Thanks in advance.

As a side note:
Try to move the “target fix” into the Dataset and remove it from the training loop, or alternatively use pure PyTorch code in the training loop.
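
Something like this could work (just a sketch; I'm assuming your Dataset already loads the input images and label arrays, so the names here are placeholders):

```python
import numpy as np
import torch
from torch.utils.data import Dataset

class RelabeledSegDataset(Dataset):
    """Sketch: remap the raw label ids to training classes inside the Dataset."""
    def __init__(self, images, labels, transform=None):
        self.images = images          # placeholder: list of input images
        self.labels = labels          # placeholder: list of label images/arrays
        self.transform = transform

    def __len__(self):
        return len(self.images)

    def __getitem__(self, idx):
        image = self.images[idx]
        label = np.array(self.labels[idx])

        # Keep ids 15 and 40 as classes 1 and 2; everything else is background
        target = np.zeros_like(label)
        target[label == 15] = 1
        target[label == 40] = 2

        if self.transform is not None:
            image = self.transform(image)
        return image, torch.from_numpy(target).long()
```

That way the training loop only sees ready-to-use tensors and doesn't have to touch numpy at all.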

You have 66GB of RAM so a GPU won’t help if you run out of memory on this machine.
Do you have the code hosted on GitHub so that I could try to run it?

Hi @ptrblck

Here is the code.

I will check the code relocation part and update you.

Thanks in advance :slight_smile:

Sorry for the late reply, I just ran it on a machine with 256GB RAM and the forward pass for batch_size=1 took approx. 53.2 GB.
Skimming your code I can’t see anything wrong. Your model just seems to be very big.

Here is another implementation of SegNet, which just takes ~1GB for batch_size=1.
Could you have a look at it and compare both architectures?

I’m not familiar with the implementation of SegNet, but there are some differences between your approach and the aforementioned implementation.

Hi, thanks a lot @ptrblck

I am using ResNet, even though the question title reads “SegNet” (my bad, I had tried SegNet earlier and later changed to ResNet).

I dismantled your model and the image size of [1296, 966] is just huge.
Could you try to resize the image and adapt the model to it?
Alternatively you could slice the image and use a “windowed” approach.

Would this work for you?

Slice the image, as in?

The UNet paper used this approach to save memory and work with arbitrarily large images (see Figure 2 of the paper).

They called it the “Overlap-tile strategy”, so sorry for the confusion. I couldn’t find a better name, so I just said “slice”. :wink:
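
Here is a minimal sliding-window sketch (without the overlapping margins from the paper, and assuming the model keeps the spatial size of its input; num_classes=3 is just an assumption):

```python
import torch

def predict_in_tiles(model, image, num_classes=3, tile=256):
    """Run the model tile by tile and stitch the predictions back together.
    image: tensor of shape [1, C, H, W]."""
    _, _, h, w = image.shape
    output = torch.zeros(1, num_classes, h, w)
    with torch.no_grad():
        for y in range(0, h, tile):
            for x in range(0, w, tile):
                patch = image[:, :, y:y + tile, x:x + tile]
                out = model(patch)
                # Edge tiles may be smaller than `tile`, so use the output shape
                output[:, :, y:y + out.shape[2], x:x + out.shape[3]] = out
    return output
```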

Hi @ptrblck,
thanks a lot… :slight_smile:

Hi @ptrblck

thanks a lot for all the help so far. Could you please help me with evaluating this model?

I am getting


TypeError: Performing basic indexing on a tensor and encountered an error indexing dim 0 with an object of type Variable. The only supported types are integers, slices, numpy scalars, or if indexing with a torch.LongTensor or torch.ByteTensor only a single Tensor may be passed.

while converting the label with color_transform.

Here is the full exception stack trace:


size of gray image  torch.Size([1, 3, 136, 136])
color_image shape  torch.Size([3, 3, 136])
self.cmap shape  torch.Size([3, 3])
Mask ==>  torch.Size([3, 136, 136])  label ==>  1
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-177-66d081c46bdc> in <module>()
----> 1 label = color_transform(label)

<ipython-input-173-97981ebf558b> in __call__(self, gray_image)
     43 
     44             print("Mask ==> ",mask.shape," label ==> ",label)
---> 45             color_image[0][mask] = self.cmap[label][0]
     46             color_image[1][mask] = self.cmap[label][1]
     47             color_image[2][mask] = self.cmap[label][2]

TypeError: Performing basic indexing on a tensor and encountered an error indexing dim 0 with an object of type Variable. The only supported types are integers, slices, numpy scalars, or if indexing with a torch.LongTensor or torch.ByteTensor only a single Tensor may be passed.

Here is my code for coloring the labels for semantic segmentation:

def colormap(n):
    # Build an n x 3 uint8 RGB color map via bit manipulation
    cmap=np.zeros([n, 3]).astype(np.uint8)

    for i in np.arange(n):
        r, g, b = np.zeros(3)

        for j in np.arange(8):
            r = r + (1<<(7-j))*((i&(1<<(3*j))) >> (3*j))
            g = g + (1<<(7-j))*((i&(1<<(3*j+1))) >> (3*j+1))
            b = b + (1<<(7-j))*((i&(1<<(3*j+2))) >> (3*j+2))

        cmap[i,:] = np.array([r, g, b])

    return cmap

class Relabel:

    def __init__(self, olabel, nlabel):
        self.olabel = olabel
        self.nlabel = nlabel

    def __call__(self, tensor):
        
        assert isinstance(tensor, torch.LongTensor), 'tensor needs to be LongTensor'
        
        for i in tensor:
            i[i == self.olabel] = self.nlabel
        
        #print("tensor ",tensor)
        #print("\n olabel ",self.olabel)
        #tensor[tensor == self.olabel] = self.nlabel
        return tensor


class ToLabel:

    def __call__(self, image):
        
#         tensors = []
#         for i in image:
#             tensors.append(torch.from_numpy(np.array(i)).long())
#         return tensors
        return torch.from_numpy(np.array(image)).long().unsqueeze(0)

class Colorize:

    def __init__(self, n=3):
        #self.cmap = colormap(256)
        self.cmap = colormap(136)
        #self.cmap = colormap(3)
        self.cmap[n] = self.cmap[-1]
        self.cmap = torch.from_numpy(self.cmap[:n])

    def __call__(self, gray_image):
        size = gray_image.size()
        print("size of gray image ",size)
        color_image = torch.ByteTensor(3, size[1], size[2]).fill_(0)
        
        #color_image = torch.ByteTensor(3, size[0], size[1]).fill_(0)
        print("color_image shape ",color_image.shape)
        print("self.cmap shape ",self.cmap.shape)
        for label in range(1, len(self.cmap)):
            mask = (label == gray_image[0]).cpu()
            
#             mask = gray_image[0] == label
#             color_image[0][mask] = self.cmap[label][0]
#             color_image[1][mask] = self.cmap[label][1]
#             color_image[2][mask] = self.cmap[label][2]

            print("Mask ==> ",mask," label ==> ",label)    
            color_image[0][mask] = self.cmap[label][0]
            color_image[1][mask] = self.cmap[label][1]
            color_image[2][mask] = self.cmap[label][2]

        return color_image

Here is the code for the transformations:


color_transform = Colorize()
image_transform = ToPILImage()

input_transform = Compose([
    CenterCrop(256),
    Scale(136),
    ToTensor(),
    Normalize([.485, .456, .406], [.229, .224, .225]),
])
target_transform = Compose([
    CenterCrop(256),
    Scale(136),
    ToLabel(),
    Relabel(255, 3),
])

Here is the code for model evaluation:


model.eval()
image = input_transform(Image.open('dataset-1.0/'+'images/060_resize.png')) 
label = model(Variable(image, volatile=True).unsqueeze(0))
label = color_transform(label)

Could you please help ?

It seems there are some issues regarding the shapes.
color_image has a shape of [3, 3, 136], which seems weird.
Since the mask is [3, 136, 136], I assume the color_image should have the same dimensions.
Also, it seems you are trying to slice the channels from the color_image and apply the mask on each slice.
If your mask already has 3 channels, you don’t need to do this. Alternatively, your mask should only have a width and height.

If you would like to assign a value using the mask, you could use color_image.masked_fill_(mask, value).
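
A tiny standalone example of what I mean (made-up values, just to show the call):

```python
import torch

labels = torch.tensor([[0, 1, 2],
                       [1, 0, 2]])
red_channel = torch.zeros(2, 3, dtype=torch.uint8)

mask = labels == 1                 # mask with the same height and width
red_channel.masked_fill_(mask, 255)
print(red_channel)
# tensor([[  0, 255,   0],
#         [255,   0,   0]], dtype=torch.uint8)
```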

Also, you should update to PyTorch 0.4.0, since it’s the latest stable release! :wink:

Hi @ptrblck, thanks for your reply.

Actually, I am a bit confused.
I have some doubts; it would be really great if you could address them:

  1. My input transformation resizes to 136.
    So, will my colormap be of size 136?

  2. Should I pass output[0].max(0)[1] to the color_transform function?
    What is the advantage of this?

  1. I think your mask should have the same height and width as the color_image it is masking, i.e. [1, 136, 136] or [3, 136, 136].
    This line should most likely be changed:
    color_image = torch.ByteTensor(3, size[1], size[2]).fill_(0)
    to
    color_image = torch.ByteTensor(3, size[2], size[3]).fill_(0).

  2. I think it’s alright to pass label to color_transform.

Try out this code:

import numpy as np
import torch
import matplotlib.pyplot as plt

class Colorize:

    def __init__(self, n=3):
        self.cmap = colormap(136)
        self.cmap[n] = self.cmap[-1]
        self.cmap = torch.from_numpy(self.cmap[:n])

    def __call__(self, gray_image):
        size = gray_image.size()
        print("size of gray image ",size)
        color_image = torch.ByteTensor(3, size[2], size[3]).fill_(0)
        
        #color_image = torch.ByteTensor(3, size[0], size[1]).fill_(0)
        print("color_image shape ",color_image.shape)
        print("self.cmap shape ",self.cmap.shape)
        for label in range(1, len(self.cmap)):
            mask = (label == gray_image[0]).cpu()

            print("Mask ==> ",mask," label ==> ",label)    
            color_image[0].masked_fill_(mask[0], self.cmap[label][0])
            color_image[1].masked_fill_(mask[1], self.cmap[label][1])
            color_image[2].masked_fill_(mask[2], self.cmap[label][2])

        return color_image
    

color_transform = Colorize()

x = torch.zeros(1, 3, 24, 24).random_(3)
color = color_transform(x)

plt.imshow(x[0].numpy().transpose(1, 2, 0))
plt.imshow(color.numpy().transpose(1, 2, 0))

Hi @ptrblck, thanks for your help. The code snippet you provided is working.
However, I am still finding it a bit difficult to incorporate it into the actual code.

I will ping you with the issues.

Thanks a lot again :slight_smile:

If your model output is in this shape [batch_size, classes, height, width], you could do the following:

batch_size, classes, h, w = 10, 3, 24, 24
output = torch.randn(batch_size, classes, h, w)
pred = torch.argmax(output, 1)

Hi @ptrblck

I used the sample code you provided. It is working after a minor modification (changing mask to mask.data).

But in the output semantic segmentation image, everything is coming out black.

Do I need to preprocess the image?

Here are the input labels I am using for training the model:


        im_arr = np.array(labels)
        #print("Labels ==> ",np.unique(np.array(labels)))
        im_arr[(im_arr==0) | (im_arr==11) | (im_arr==12) | (im_arr==13) | (im_arr==14) | (im_arr==15) | (im_arr==16)]=0
        
        im_arr[(im_arr==17) | (im_arr==18) | (im_arr==19) | (im_arr==20) | (im_arr==21) | (im_arr==22) | (im_arr==23) | (im_arr==24) | (im_arr==25) | (im_arr==28) | (im_arr==29) | (im_arr==30) ]=0
        
        im_arr[(im_arr==34)| (im_arr==35) | (im_arr==40) | (im_arr==41)]=2

Update :slight_smile:

Hi, I am classifying the classes as follows:

label_np = label.data.numpy()
np.amax(label_np) #3.9837084
np.amin(label_np) # -1.5343161

label_np[label_np<0]=0
label_np[label_np>3]=2

label_np[label_np==0].shape #(29490,)

label_np[np.logical_and(label_np>0.6,label_np<=3)]=1
label_np[np.logical_and(label_np>0,label_np<=0.6)]=0

color1 = color_transform(torch.from_numpy(label_np))
temp_image1 = image_transform(color1)
temp_image1.save("temp_image_00001.png")

Is this the correct approach?

Please help.