Error: Expected more than 1 value per channel when training

You could try installing the nightly binaries if you are stuck building from source.
Select Preview (Nightly) here to get the install command.


With the pytorch-nightly version, there is no error. Thanks very much.

So each device would NOT need more than a single value, and it is fine as long as the total_batch_size is more than one?

That was exactly my issue! Thank you! :vulcan_salute:

I have the same problem, and, as mentioned before, it also happens when I try to run DeepLabv3 (my own implementation). It also occurs at exactly the same place, right before the interpolation.

If batch size > 1, the problem is solved, but I don’t like this solution, since in medical imaging segmentation batch size = 1 is normal. Also, as mentioned before, doesn’t it use the running mean/std? In addition, if I remove the BN layer right before the interpolation, the problem is also solved, which is super weird, because there are BN layers everywhere else in the code.

Edit: I see now where the error comes from.
It appears in the _verify_batch_size function, which is called at training time. The function definition:

def _verify_batch_size(size):
    size_prods = size[0]
    for i in range(len(size) - 2):
        size_prods *= size[i + 2] 
    if size_prods == 1:
        raise ValueError('Expected more than 1 value per channel when training, got input size {}'.format(size))

After the global average pooling in DeepLabv3, the size of your feature maps with batch size = 1 is 1x256x1x1. This function calculates size_prods by multiplying every element of feature_maps.shape except the channel dimension; therefore size_prods = 1 * 1 * 1 = 1, and it raises the exception. This explains why batch size > 1 has always worked for me, and why everybody hits the same issue at this particular line of code.

The exception sort of makes sense: it’s probably not a good idea to calculate any statistics when you only have a single value per feature map. Am I wrong? The easiest solution (at least in the case of DeepLabv3) is to remove this particular BatchNorm layer.
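For reference, here is a minimal sketch reproducing the failure with the shapes from above (the 1x256x1x1 activation stands in for the output of the global average pooling):

import torch
import torch.nn as nn

bn = nn.BatchNorm2d(256)

x = torch.randn(1, 256, 1, 1)  # batch size 1 after global average pooling

bn.train()
try:
    bn(x)  # a single value per channel -> ValueError
except ValueError as e:
    print(e)

bn(torch.randn(2, 256, 1, 1))  # batch size 2 works fine in training mode

bn.eval()
out = bn(x)  # eval mode uses the running stats and also works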


I had a similar issue when my batch_size=2. I’m not sure why, but this led to the error message ... got input size torch.Size([1, 256, 1, 1]), even though in my DataLoader I had drop_last=True and num_workers=0 (and my dataset size was divisible by the batch size with no remainder). I increased my batch_size to 4 and now I have no problems running on multiple GPUs. I was only using the small batch size because I thought it would help with memory issues, but it looks like it was my downfall.

I have set drop_last=True in my val_loader, but I am getting the same error:
val_loader = torch.utils.data.DataLoader(
    datasets.ImageFolder(valdir, transforms.Compose([
        transforms.Resize((256, 256)),
        transforms.CenterCrop(input_size),
        transforms.ToTensor(),
        normalize,
    ])),
    batch_size=args.batch_size, shuffle=False,
    num_workers=args.workers, pin_memory=True, drop_last=True)

Are you seeing the error for a specific batch only or right from the beginning?
If from the beginning during the validation loop, did you call model.eval() and is your batch size set to 1?

  • Basically, I want to print each layer’s output; the layers are wrapped in a Sequential module, so I was trying this code to get the required layer outputs (see also the sketch after this list):
    x = someinput
    for l in model.features.modules():
        x = l(x)

  • When I was evaluating it (on a single image), I didn’t get any error.

  • Yeah, I have set the batch size to 1.
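A quick sketch of that loop with a toy stand-in for model.features: note that .modules() also yields the Sequential container itself as its first element, so iterating over .children() applies each layer exactly once.

import torch
import torch.nn as nn

model = nn.Sequential(nn.Linear(8, 8), nn.ReLU(), nn.Linear(8, 4))  # toy stand-in for model.features

x = torch.randn(1, 8)
for layer in model.children():  # children() skips the container itself
    x = layer(x)
    print(type(layer).__name__, x.shape)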

Were you using model.eval() in both cases? If so, could you post a code snippet to reproduce this issue?

Could you post a minimal code snippet to reproduce this issue using random data, please?

This is the code I used to generate the error:


import torch
import torch.nn as nn
import numpy as np

class PrintLayer(nn.Module):
    def __init__(self):
        super(PrintLayer, self).__init__()

    def forward(self, x):
        # Do your print / debug stuff here
        #print("x is", x)
        return x

def init_weights(m):
    if type(m) == nn.Linear:
        torch.nn.init.xavier_uniform_(m.weight)
        # load the weight
        a_file1 = open("lastweight.txt", "r")
        list_of_lists1 = []
        for line in a_file1:
            list_of_lists1.extend(line.strip().split())
        a_file1.close()
        w = torch.from_numpy(np.array(list_of_lists1, dtype='float32'))
        m.weight.data = w.reshape(1000, 4096)
        # load the bias
        a_file12 = open("last bias.txt", "r")
        list_of_lists12 = []
        for line in a_file12:
            list_of_lists12.extend(line.strip().split())
        a_file12.close()
        print("bias")
        b = torch.from_numpy(np.array(list_of_lists12, dtype='float32'))
        m.bias.data = b.reshape(1000)
        #print("actual weight is", m.weight)
        #print(m.bias)

model = nn.Sequential(
    #nn.Dropout(),
    nn.BatchNorm1d(4096, eps=1e-3, momentum=0.1, affine=True),
    PrintLayer(),  # Add Print layer for debug
)

model.apply(init_weights)

# load the input
a_file = open("lin_output.txt", "r")
list_of_lists = []
for line in a_file:
    list_of_lists.extend(line.strip().split())
a_file.close()

x = torch.from_numpy(np.array(list_of_lists, dtype='float32'))
x = x.reshape(1, 4096)
print(x.dtype)

output = model(x)
print(output.shape)

np.set_printoptions(threshold=50000000, formatter={'float_kind': '{:f}'.format})
print(type(output))

values, indices = torch.max(output, 1)
print(indices)
print(values)

output = output.detach().numpy()
np.savetxt("last_layer_output_1.txt", output, fmt='%.6f')

#print(model)
#for weight and bias
#print(list(model.named_parameters()))
#print(model.state_dict())

But when I try to use random inputs, weights, and biases, I am getting the same error. Here is the code:

import torch
import torch.nn as nn

class PrintLayer(nn.Module):
    def __init__(self):
        super(PrintLayer, self).__init__()

    def forward(self, x):
        # Do your print / debug stuff here
        #print("x is", x)
        return x

model = nn.Sequential(
    #nn.Dropout(),
    nn.BatchNorm1d(4096, eps=1e-3, momentum=0.1, affine=True),
    PrintLayer(),  # Add Print layer for debug
)

x = torch.randn(1, 4096)
output = model(x)

#print(model)
# for weight and bias
print(list(model.named_parameters()))
#print(model.state_dict())

Thanks for the code.
Since you are not calling model.eval(), it’s expected that this error is raised in your code snippet, as explained before.
If you add the model.eval() call before feeding the input to the model, it’ll work.
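A minimal sketch of the difference, using the same BatchNorm1d setup as in your snippet:

import torch
import torch.nn as nn

model = nn.Sequential(nn.BatchNorm1d(4096, eps=1e-3, momentum=0.1, affine=True))
x = torch.randn(1, 4096)

model.train()
try:
    model(x)  # single sample in training mode -> ValueError
except ValueError as e:
    print(e)

model.eval()  # use the running statistics instead of the batch statistics
out = model(x)  # now a single sample works
print(out.shape)  # torch.Size([1, 4096])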

I don’t understand the statement. What would be dropped in which use case? Your quote is unfortunately from another user. :confused:

If you want to skip the training for batches containing single samples, this is a valid approach.
Sometimes the DataLoader can yield a smaller last batch, which might create this issue. In that case, you could pass drop_last=True to remove it, as shown below.
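For illustration, a small sketch of this behavior with made-up numbers (a dataset of 9 samples and a batch size of 4, so the last batch would contain a single sample):

import torch
from torch.utils.data import DataLoader, TensorDataset

dataset = TensorDataset(torch.randn(9, 3))  # 9 samples

# without drop_last the final batch contains a single sample: 4, 4, 1
for batch, in DataLoader(dataset, batch_size=4):
    print(batch.shape[0])

# with drop_last=True the incomplete final batch is removed: 4, 4
for batch, in DataLoader(dataset, batch_size=4, drop_last=True):
    print(batch.shape[0])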

I’m sorry for my confusing reply and thank you for your fast response. After reading your reply a second time, I realized that drop_last=True only matters if the training dataset size is not divisible by the batch size. Otherwise it still trains on the last batch.

When I try to train a model using 4 GPUs I get the error

line 1902, in _verify_batch_size
    raise ValueError('Expected more than 1 value per channel when training, got input size {}'.format(size))
ValueError: Expected more than 1 value per channel when training, got input size torch.Size([1, 256, 1, 1])

But the same code doesn’t have any problems when I use 2 GPUs. I’m using train_dl = DataLoader(train_ds, batch_size=10, shuffle=True, drop_last=True) for my loader. Is there a way I can spread this across 4 GPUs without causing the error? I’m using DeepLabv3 for my model, if this helps. Thank you.

nn.DataParallel splits the batch in dim0 into chunks of ceil(batch_size / num_gpus), so a batch of 10 samples would be split as [5, 5] on 2 GPUs, but as [3, 3, 3, 1] on 4 GPUs, and the last device would indeed get a single sample.
Could you add a print statement before the failing operation and check the shape of the input tensor for the 2 and 4 GPU runs?
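To illustrate the splitting, here is a sketch using torch.chunk, which follows the same rule as the scatter in nn.DataParallel, as far as I know:

import torch

batch = torch.randn(10, 3, 224, 224)
print([c.shape[0] for c in batch.chunk(2)])  # [5, 5]       -> 2 GPUs, fine
print([c.shape[0] for c in batch.chunk(4)])  # [3, 3, 3, 1] -> 4 GPUs, one device gets a single sample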

I’m getting the same error when I’m evaluating my medical image segmentation model.
I’ve used model.eval() and torch.no_grad(), but it still throws this error:

ValueError: Expected more than 1 value per channel when training, got input size torch.Size([1, 128, 1, 1, 1])

The model I’m using is a 3D U-Net architecture with inputs of size (16, 16, 16); the model can be found here. As I’ve called model.eval() beforehand, I think I should not be getting the above error.

I’m using NiftyNet to load the MRI images, and it takes care of the batch loading for the model.

I cannot share my actual code; however, it is very similar to what is done in this link.

Please point out where I’m making a mistake. I’m still learning PyTorch.

The linked model works fine with a single input sample:

model = Modified3DUNet(1, 2).eval()
x = torch.randn(1, 1, 48, 48, 48)
out = model(x)

I cannot find any modifications to the model in the second link that could add e.g. additional batchnorm layers.

Since you cannot share the code, I would recommend isolating the batchnorm layer which is raising this error and checking its .training attribute.
If it’s not set to False after calling model.eval(), it might not be registered properly as a module.
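A minimal sketch of that check, using a toy model as a stand-in for your actual model:

import torch.nn as nn

model = nn.Sequential(nn.Conv3d(1, 8, 3), nn.BatchNorm3d(8), nn.ReLU())
model.eval()

for name, module in model.named_modules():
    if isinstance(module, (nn.BatchNorm1d, nn.BatchNorm2d, nn.BatchNorm3d)):
        print(name, module.training)  # should print False after model.eval()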