Error: Expected more than 1 value per channel when training

Thanks for the response. I checked the implementation and the error seems to originate from an InstanceNorm layer. The layer's training flag is set to False after calling model.eval(). Please suggest what I can do further. Another issue I've noticed is that the Dice coefficient I use as my metric gives a value greater than 1 in the last few epochs. What could I be doing wrong? Here's the full error:

<ipython-input-19-4be8d4a14b5d> in <module>
     36 
     37         with torch.no_grad():
---> 38             outputs = model(window)
     39 
     40         outputs = outputs.cpu().numpy()

/usr/local/lib/python3.6/dist-packages/torch/nn/modules/module.py in _call_impl(self, *input, **kwargs)
    720             result = self._slow_forward(*input, **kwargs)
    721         else:
--> 722             result = self.forward(*input, **kwargs)
    723         for hook in itertools.chain(
    724                 _global_forward_hooks.values(),

/workspace/sharad/Final/mod3DUnet.py in forward(self, x)
    204         out = self.conv3d_c5(out)
    205         residual_5 = out
--> 206         out = self.norm_lrelu_conv_c5(out)
    207         out = self.dropout3d(out)
    208         out = self.norm_lrelu_conv_c5(out)

/usr/local/lib/python3.6/dist-packages/torch/nn/modules/module.py in _call_impl(self, *input, **kwargs)
    720             result = self._slow_forward(*input, **kwargs)
    721         else:
--> 722             result = self.forward(*input, **kwargs)
    723         for hook in itertools.chain(
    724                 _global_forward_hooks.values(),

/usr/local/lib/python3.6/dist-packages/torch/nn/modules/container.py in forward(self, input)
    115     def forward(self, input):
    116         for module in self:
--> 117             input = module(input)
    118         return input
    119 

/usr/local/lib/python3.6/dist-packages/torch/nn/modules/module.py in _call_impl(self, *input, **kwargs)
    720             result = self._slow_forward(*input, **kwargs)
    721         else:
--> 722             result = self.forward(*input, **kwargs)
    723         for hook in itertools.chain(
    724                 _global_forward_hooks.values(),

/usr/local/lib/python3.6/dist-packages/torch/nn/modules/instancenorm.py in forward(self, input)
     55         return F.instance_norm(
     56             input, self.running_mean, self.running_var, self.weight, self.bias,
---> 57             self.training or not self.track_running_stats, self.momentum, self.eps)
     58 
     59 

/usr/local/lib/python3.6/dist-packages/torch/nn/functional.py in instance_norm(input, running_mean, running_var, weight, bias, use_input_stats, momentum, eps)
   2033                 running_var=running_var, weight=weight, bias=bias,
   2034                 use_input_stats=use_input_stats, momentum=momentum, eps=eps)
-> 2035     _verify_batch_size(input.size())
   2036     return torch.instance_norm(
   2037         input, weight, bias, running_mean, running_var,

/usr/local/lib/python3.6/dist-packages/torch/nn/functional.py in _verify_batch_size(size)
   1993         size_prods *= size[i + 2]
   1994     if size_prods == 1:
-> 1995         raise ValueError('Expected more than 1 value per channel when training, got input size {}'.format(size))
   1996 
   1997 

ValueError: Expected more than 1 value per channel when training, got input size torch.Size([1, 128, 1, 1, 1])

nn.InstanceNorm*d doesn’t track running stats by default, so you might want to enable it.
However, I cannot reproduce this issue using your reported shapes:

import torch
import torch.nn as nn

norm = nn.InstanceNorm3d(128)
x = torch.randn(1, 128, 1, 1, 1)
out = norm(x)  # training mode

norm.eval()
out = norm(x)  # eval mode
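For reference, a minimal sketch of the first suggestion above; the shapes here are placeholders, not from the thread:

import torch
import torch.nn as nn

# track_running_stats=True makes InstanceNorm accumulate
# running_mean / running_var during training, which eval() can then use
norm = nn.InstanceNorm3d(128, track_running_stats=True)
_ = norm(torch.randn(2, 128, 4, 4, 4))  # training-mode pass updates the stats
norm.eval()  # subsequent calls normalize with the tracked stats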

Thanks, will try that!

Hi sir,

I tried using the code snippet you mentioned here, and it does not work in either of the modes. I also checked the code for instance normalization and am putting it here for reference:

def instance_norm(input, running_mean=None, running_var=None, weight=None,
                  bias=None, use_input_stats=True, momentum=0.1, eps=1e-5):
    # type: (Tensor, Optional[Tensor], Optional[Tensor], Optional[Tensor], Optional[Tensor], bool, float, float) -> Tensor  # noqa
    r"""Applies Instance Normalization for each channel in each data sample in a
    batch.
    See :class:`~torch.nn.InstanceNorm1d`, :class:`~torch.nn.InstanceNorm2d`,
    :class:`~torch.nn.InstanceNorm3d` for details.
    """
    if not torch.jit.is_scripting():
        if type(input) is not Tensor and has_torch_function((input,)):
            return handle_torch_function(
                instance_norm, (input,), input, running_mean=running_mean,
                running_var=running_var, weight=weight, bias=bias,
                use_input_stats=use_input_stats, momentum=momentum, eps=eps)
    _verify_batch_size(input.size())
    return torch.instance_norm(
        input, weight, bias, running_mean, running_var,
        use_input_stats, momentum, eps, torch.backends.cudnn.enabled
    )

Interestingly, the mode (training or evaluation) is not checked before calling the _verify_batch_size function for instance norm, although it is for batch norm. Is there a specific reason for this?
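For reference, a minimal sketch of the asymmetry described above, assuming the torch 1.6-era sources quoted in this thread (batch norm skips the size check in eval mode because it can fall back to running stats; instance norm runs it unconditionally):

import torch
import torch.nn as nn

x = torch.randn(1, 128, 1, 1)  # one value per channel

bn = nn.BatchNorm2d(128)
bn.eval()
out = bn(x)  # works: running stats are used, no batch stats are needed

inorm = nn.InstanceNorm2d(128)
inorm.eval()
out = inorm(x)  # raises ValueError here: _verify_batch_size runs regardless of mode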

Thank you so much for your time!

Regards
Harsh

You are right and I get a proper error now. Based on this image:

[image: Figure 2 from the Group Normalization paper]

I don't think that InstanceNorm2d should be able to calculate the output using a single pixel, since the stats are calculated over H, W.
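As a quick illustration of the point above (the shapes are illustrative, not from the thread): the per-channel statistics are computed over the spatial dimensions, so a 1x1 activation has zero variance and nothing meaningful to normalize.

import torch

x = torch.randn(1, 3, 1, 1)  # a single pixel per channel
mean = x.mean(dim=(2, 3), keepdim=True)                 # equals x itself
var = x.var(dim=(2, 3), unbiased=False, keepdim=True)   # all zeros
print(var)  # zero variance -> (x - mean) / sqrt(var + eps) degenerates to zeros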

I’ve updated this issue to track it.


Thank you for your quick response.

I'm using try/except because it only fails on the last batch. Do I need to switch the bn back to training mode afterwards, and will this mess up backpropagation?

import torch
import torch.nn as nn

class SiameseFCLayer(nn.Module):
    def __init__(self, inchan, outchan):
        super(SiameseFCLayer, self).__init__()
        self.fc = nn.Linear(inchan, outchan)
        self.bn = nn.BatchNorm1d(outchan)

    def forward(self, x):
        out = self.fc(x)
        out = torch.sigmoid(out)
        try:
            out = self.bn(out)  # fails when the batch holds a single sample
        except ValueError:
            self.bn.eval()      # fall back to running stats for this batch
            out = self.bn(out)
            self.bn.train()
        return out

I would drop the last batch if it only contains a single sample via drop_last=True in the DataLoader.
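A minimal sketch of that suggestion; the dataset and batch size are placeholders, not from the thread:

import torch
from torch.utils.data import DataLoader, TensorDataset

dataset = TensorDataset(torch.randn(101, 10))  # 101 samples, batch size 4 -> last batch would hold 1
loader = DataLoader(dataset, batch_size=4, shuffle=True, drop_last=True)
# drop_last=True skips the trailing incomplete batch, so BatchNorm
# never sees a single-sample batch during training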

What is the name of the book the attached picture was extracted from?

It’s Figure 2 from the Group Normalization paper.


Thanks for sharing the link.

what a lifesaver, thx

I get the same error when training on multiple GPUs, but there is no error when training on a single GPU.
Is there any fix for this?

Can anyone here explain why I get this error only when I set the batch size to 1, 2, 3, 4, 5, 7, 10, or 13 and train the model on 4 GPUs? I also get the error when I set the batch size to 1, 2, 3, 5, or 7 and train on 3 GPUs.

If the problem is that nn.BatchNorm needs more than 1 value to calculate the mean and std, then why do 10 and 13 on 4 GPUs, or 7 on 3 GPUs, not work? I already set drop_last=True in the DataLoader.
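A likely explanation, sketched under the assumption that the model is wrapped in nn.DataParallel, which scatters each batch across the GPUs the way torch.chunk splits a tensor: every batch size listed above leaves one replica with a single sample, even with drop_last=True.

import torch

for batch_size, num_gpus in [(13, 4), (10, 4), (7, 3)]:
    # per-replica batch sizes after scattering across the GPUs
    chunk_sizes = [len(c) for c in torch.arange(batch_size).chunk(num_gpus)]
    print(batch_size, num_gpus, chunk_sizes)
# 13 4 [4, 4, 4, 1]
# 10 4 [3, 3, 3, 1]
#  7 3 [3, 3, 1]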

@ptrblck I have the same problem you are referring to. I use BatchNorm; I tried using InstanceNorm1d or InstanceNorm3d, but it didn't work for my training data or my current architecture. I wanted to see if you know of any way I could get rid of that batch, given that I split the training data into batches using a DataLoader.

Assuming the last batch is smaller than the rest and thus creates an activation with a single pixel, you could remove it by using drop_last=True in the DataLoader.


Hello @ptrblck_de Sir, I have the following code for PSPNet and I am getting the error
"ValueError: Expected more than 1 value per channel when training, got input size torch.Size([1, 512, 1, 1])" when testing the model with a random input. I need to train this model on my dataset, whose images have size [3, 512, 512].

class PyramidPool(nn.Module):

    def __init__(self, in_features, out_features, pool_size):
        super(PyramidPool, self).__init__()

        self.features = nn.Sequential(
            nn.AdaptiveAvgPool2d(pool_size),
            nn.Conv2d(in_features, out_features, 1, bias=False),
            nn.BatchNorm2d(out_features, momentum=.95),
            nn.ReLU(inplace=True)
        )

    def forward(self, x):
        size = x.size()
        output = F.upsample(self.features(x), size[2:], mode='bilinear')
        # print(output.size())
        return output

class PSPNet(nn.Module):

    def __init__(self, num_channels, num_classes, pretrained=False):
        super(PSPNet, self).__init__()
        print("initializing model")
        self.num_channels = num_channels
        self.num_classes = num_classes
        #init_net=deeplab_resnet.Res_Deeplab()
        #state=torch.load("models/MS_DeepLab_resnet_trained_VOC.pth")
        #init_net.load_state_dict(state)
        self.resnet = torchvision.models.resnet50(pretrained=pretrained)

        self.layer5a = PyramidPool(2048, 512, 1)
        self.layer5b = PyramidPool(2048, 512, 2)
        self.layer5c = PyramidPool(2048, 512, 3)
        self.layer5d = PyramidPool(2048, 512, 6)

        self.final = nn.Sequential(
            nn.Conv2d(4096, 512, 3, padding=1, bias=False),
            nn.BatchNorm2d(512, momentum=.95),
            nn.ReLU(inplace=True),
            nn.Dropout(.1),
            nn.Conv2d(512, num_classes, 1),
        )

        initialize_weights(self.layer5a, self.layer5b, self.layer5c, self.layer5d, self.final)

    def forward(self, x):
        count = 0

        size = x.size()
        print(size)

        x = self.resnet.conv1(x)
        print(x.size())
        x = self.resnet.bn1(x)
        print(x.size())
        x = self.resnet.relu(x)
        print(x.size())
        # x = self.resnet.maxpool(x)
        x = self.resnet.layer1(x)
        print(x.size())
        x = self.resnet.layer2(x)
        print(x.size())
        x = self.resnet.layer3(x)
        print(x.size())
        x = self.resnet.layer4(x)
        print(x.size())

        x = self.final(torch.cat([
            x,
            self.layer5a(x),  # ------> Here I am getting the error
            self.layer5b(x),
            self.layer5c(x),
            self.layer5d(x),
        ], 1))
        # print(x.size())

        x = F.upsample_bilinear(x, size[2:])
        print(x.size())

        return x

if __name__ == "__main__":
    image = torch.randn(1, 3, 572, 572)
    model = PSPNet(num_channels=3, num_classes=1)
    # model.eval()
    print(model(image))

You would have to uncomment the model.eval() call, so that the internal running stats of all batchnorm layers are used. In training mode the batch stats are calculated; since an intermediate activation has only a single pixel, these stats cannot be calculated and this error is raised.
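A minimal sketch of the difference (the layer size mirrors the failing activation above):

import torch
import torch.nn as nn

bn = nn.BatchNorm2d(512, momentum=.95)
x = torch.randn(1, 512, 1, 1)  # one value per channel, as in the error

bn.train()
# bn(x) would raise the ValueError: batch stats over a single value are undefined
bn.eval()
out = bn(x)  # works: the stored running_mean / running_var are used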

Thank you for your reply @ptrblck Sir. Still, I got the following error:

RuntimeError                              Traceback (most recent call last)
in <module>
      4 model = PSPNet(num_channels=3, num_classes=1)
      5 model.eval()
----> 6 print(model(image))

~/anaconda3/envs/DL_ALL_torch/lib/python3.6/site-packages/torch/nn/modules/module.py in _call_impl(self, *input, **kwargs)
    725             result = self._slow_forward(*input, **kwargs)
    726         else:
--> 727             result = self.forward(*input, **kwargs)
    728         for hook in itertools.chain(
    729                 _global_forward_hooks.values(),

in forward(self, x)
    110         # self.layer5c(x),
    111         # self.layer5d(x),
--> 112         ], 1))
    113         # print(x.size())
    114

~/anaconda3/envs/DL_ALL_torch/lib/python3.6/site-packages/torch/nn/modules/module.py in _call_impl(self, *input, **kwargs)
    725             result = self._slow_forward(*input, **kwargs)
    726         else:
--> 727             result = self.forward(*input, **kwargs)
    728         for hook in itertools.chain(
    729                 _global_forward_hooks.values(),

~/anaconda3/envs/DL_ALL_torch/lib/python3.6/site-packages/torch/nn/modules/container.py in forward(self, input)
    115     def forward(self, input):
    116         for module in self:
--> 117             input = module(input)
    118         return input
    119

~/anaconda3/envs/DL_ALL_torch/lib/python3.6/site-packages/torch/nn/modules/module.py in _call_impl(self, *input, **kwargs)
    725             result = self._slow_forward(*input, **kwargs)
    726         else:
--> 727             result = self.forward(*input, **kwargs)
    728         for hook in itertools.chain(
    729                 _global_forward_hooks.values(),

~/anaconda3/envs/DL_ALL_torch/lib/python3.6/site-packages/torch/nn/modules/conv.py in forward(self, input)
    421
    422     def forward(self, input: Tensor) -> Tensor:
--> 423         return self._conv_forward(input, self.weight)
    424
    425 class Conv3d(_ConvNd):

~/anaconda3/envs/DL_ALL_torch/lib/python3.6/site-packages/torch/nn/modules/conv.py in _conv_forward(self, input, weight)
    418                             _pair(0), self.dilation, self.groups)
    419         return F.conv2d(input, weight, self.bias, self.stride,
--> 420                         self.padding, self.dilation, self.groups)
    421
    422     def forward(self, input: Tensor) -> Tensor:

RuntimeError: Given groups=1, weight of size [512, 4096, 3, 3], expected input[1, 2560, 32, 32] to have 4096 channels, but got 2560 channels instead

The input to this particular conv layer is wrong, as it has 2560 channels while 4096 are expected.
I don't know which layer raises this error (the first one or an intermediate one), but you should be able to find it by looking at the weight dimensions (in_channels=4096, out_channels=512) and making sure the input has the right shape.
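For example, a small helper sketch to locate it (model is assumed to be the PSPNet instance above, so this is illustrative, not from the thread):

import torch.nn as nn

for name, module in model.named_modules():
    # the failing layer expects in_channels=4096 and has out_channels=512
    if isinstance(module, nn.Conv2d) and module.in_channels == 4096:
        print(name, module)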

PS: you can post code snippets by wrapping them into three backticks ```, which makes debugging easier. :wink: