You could try to install the nightly binaries, if you are stuck building from source.
Select Preview (Nightly) here to get the install command.
With the pytorch-nightly version, there is no error. Thanks very much.
So each device would NOT need more than a single value? It is fine when the total_batch_size is more than one.
That was verbatim my issue! Thank you!
I have the same problem, and, as mentioned before, it also happens when I try to run DeepLabv3 (my own implementation). It also occurs at exactly the same place, right before the interpolation.
If batch size > 1, the problem is solved, but I don’t like this solution since in medical imaging segmentation batch size = 1 is normal. Also, as mentioned before, doesn’t it use the running sum/std? In addition, if I remove the BN layer right before the interpolation, the problem is also solved, which is super weird because there are BN layers everywhere else in the code.
Edit: I see now where the error comes from.
It appears in _verify_batch_size function which is called at training time. The function definition:
    def _verify_batch_size(size):
        size_prods = size[0]
        for i in range(len(size) - 2):
            size_prods *= size[i + 2]
        if size_prods == 1:
            raise ValueError('Expected more than 1 value per channel when training, got input size {}'.format(size))
After doing Avg pooling in DeepLabv3, the size of your feature maps with batch size = 1 is: 1x256x1x1. This function calculates the size prods by multiplying every element of the feature_maps.shape except the channel; therefore, size_prods = 1 * 1 * 1 = 1, and it raises the Exception. This explains why batch size = 1 has always worked for me, and why everybody has the same issue in this particular line of code.
The exception sort of makes sense: it’s probably not a good idea to calculate any statistics when you only have a single value per feature map. Am I wrong? The easiest solution (at least in case of DeepLabv3) is to remove this particular BatchNorm layer.
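To make the arithmetic above concrete, here is a minimal sketch that recomputes the same product in plain Python for a few input shapes, including the pooled DeepLabv3 feature map. The helper name `values_per_channel` is hypothetical and not part of PyTorch; it only mirrors the loop in `_verify_batch_size`.

```python
def values_per_channel(size):
    """Product of all dimensions except the channel dimension (index 1)."""
    prod = size[0]
    for dim in size[2:]:
        prod *= dim
    return prod


shapes = [
    (1, 256, 1, 1),    # batch size 1 after global average pooling
    (2, 256, 1, 1),    # batch size 2 after global average pooling
    (1, 256, 33, 33),  # batch size 1 with a larger spatial map
    (1, 4096),         # single sample fed to a BatchNorm1d layer
]
for shape in shapes:
    n = values_per_channel(shape)
    print(shape, n, "raises in training mode" if n == 1 else "ok")
```

Only the shapes whose product is exactly 1 trigger the check, which is why a batch size of 1 is fine for every BatchNorm layer whose input still has spatial extent.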
I had a similar issue when my batch_size=2. I’m not sure why, but this was leading me to get the error message ... got input size torch.Size([1, 256, 1, 1]), even though in my DataLoader I had drop_last=True and num_workers=0 (and my dataset size was divisible by batch size with no remainder). I increased my batch_size=4 and now I have no problems running on multiple GPUs. I was only using the small batch size because I thought it would help with memory issues, but it looks like it was my downfall.
I have set drop_last=True in my val_loader, but I am getting the same error:
```python
val_loader = torch.utils.data.DataLoader(
    datasets.ImageFolder(valdir, transforms.Compose([
        transforms.Resize((256, 256)),
        transforms.CenterCrop(input_size),
        transforms.ToTensor(),
        normalize,
    ])),
    batch_size=args.batch_size, shuffle=False,
    num_workers=args.workers, pin_memory=True, drop_last=True)
```
Are you seeing the error for a specific batch only or right from the beginning?
If from the beginning during the validation loop, did you call model.eval() and is your batch size set to 1?
- Basically, I want to print the output of each layer wrapped in the sequential module, so I was trying this code to get the required layer output:

      x = someinput
      for l in model.features.modules():
          x = l(x)

- When I was evaluating it (single image), I didn’t get any error.
- Yes, I have set the batch size to 1.
Were you using model.eval() in both cases? If so, could you post a code snippet to reproduce this issue?
- Initially I wanted to print the output of Bnorm1 by feeding it the input, weight, and bias from a file. There I didn’t call model.eval(); I just ran it.
- But in my GitHub repo, when I tried to get the sequential layer outputs, that code uses model.eval().
- In both cases I am getting the same error.
https://github.com/jiecaoyu/XNOR-Net-PyTorch/blob/master/ImageNet/networks/main.py
Could you post a minimal code snippet to reproduce this issue using random data, please?
This is the code I used to generate the error:
```python
import torch
import torch.nn as nn
import csv
import numpy
import numpy as np


class PrintLayer(nn.Module):
    def __init__(self):
        super(PrintLayer, self).__init__()
        m = nn.BatchNorm1d(4096, eps=1e-3, momentum=0.1, affine=True),

    def forward(self, x):
        # Do your print / debug stuff here
        #print("x is",x)
        return x


def init_weights(m):
    if type(m) == nn.Linear:
        torch.nn.init.xavier_uniform(m.weight)
        #for weight
        a_file1 = open("lastweight.txt","r")
        list_of_lists1 = []
        for line in a_file1:
            stripped_line1 = line.strip()
            line_list1 = stripped_line1.split()
            list_of_lists1.extend(line_list1)
        a_file1.close()
        b = torch.from_numpy(numpy.array(list_of_lists1,dtype= 'float32'))
        w = torch.tensor(b)
        ten = w.reshape(1000,4096)
        m.weight.data = ten
        #for bias
        a_file12 = open("last bias.txt","r")
        list_of_lists12 = []
        for line in a_file12:
            stripped_line12 = line.strip()
            line_list12 = stripped_line12.split()
            list_of_lists12.extend(line_list12)
        a_file12.close()
        print("bias")
        c = torch.from_numpy(numpy.array(list_of_lists12,dtype= 'float32'))
        d = torch.tensor(c)
        ten1 = d.reshape(1000)
        m.bias.data = ten1
        #print("actual weight is",m.weight)
        #print(m.bias)


model = nn.Sequential(
    #nn.Dropout(),
    nn.BatchNorm1d(4096, eps=1e-3, momentum=0.1, affine=True),
    PrintLayer(),
    # Add Print layer for debug
)
model.apply(init_weights)

#for input
a_file = open("lin_output.txt","r")
list_of_lists = []
for line in a_file:
    stripped_line = line.strip()
    line_list = stripped_line.split()
    list_of_lists.extend(line_list)
a_file.close()
b = torch.from_numpy(numpy.array(list_of_lists,dtype= 'float32'))
x = torch.tensor(b)
x = x.reshape(1,4096)
print(x.dtype)
output = model(x)
print(output.shape)
np.set_printoptions(threshold=50000000,formatter={'float_kind':'{:f}'.format})
print(type(output))
values,indices = torch.max(output,1)
print(indices)
print(values)
#output = torch.from_numpy(output)
output = output.detach().numpy()
#print(max(output))
#
#
#print(output)
"""file = open("linear_output.txt","w")
file.write(str(output))
file.close()"""
np.savetxt("last_layer_output_1.txt",output,fmt='%.6f')
#print(model)
#for weight and bias
#print(list(model.named_parameters()))
#print(model.state_dict())
```
But when I try random inputs, weights, and biases, I get the same error. Here is the code:
```python
import torch
import torch.nn as nn
import csv
import numpy
import numpy as np


class PrintLayer(nn.Module):
    def __init__(self):
        super(PrintLayer, self).__init__()

    def forward(self, x):
        # Do your print / debug stuff here
        #print("x is",x)
        return x


model = nn.Sequential(
    #nn.Dropout(),
    nn.BatchNorm1d(4096, eps=1e-3, momentum=0.1, affine=True),
    PrintLayer(),
    # Add Print layer for debug
)
x = torch.randn(1,4096)
output = model(x)
#print(model)
#for weight and bias
print(list(model.named_parameters()))
#print(model.state_dict())
```
Thanks for the code.
Since you are not calling model.eval(), it’s expected that this error is raised in your code snippet, as explained before.
If you add the model.eval() call before feeding the input to the model, it’ll work.
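A minimal sketch of that fix, using random data and the same `BatchNorm1d(4096)` layer as in the snippet above. The training-mode call is wrapped in `try`/`except` only to show both behaviours side by side; the variable names are illustrative.

```python
import torch
import torch.nn as nn

model = nn.Sequential(nn.BatchNorm1d(4096, eps=1e-3, momentum=0.1, affine=True))
x = torch.randn(1, 4096)

# Training mode (the default): batch statistics cannot be computed from one value.
try:
    model(x)
    train_error = None
except ValueError as err:
    train_error = str(err)
print("train mode:", train_error)

# Eval mode: the running statistics are used, so a single sample is accepted.
model.eval()
with torch.no_grad():
    out = model(x)
print("eval mode:", out.shape)
```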
I don’t understand the statement. What would be dropped in which use case? Your quote is unfortunately from another user.
If you want to skip the training for batches containing single samples, this is a valid approach.
Sometimes the DataLoader can yield a smaller batch size as the last batch, which might create this issue. In that case, you could pass drop_last=True to remove this smaller batch.
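As a rough illustration in plain Python, the sketch below shows when a trailing single-sample batch appears and how `drop_last=True` removes it. The helper `batch_sizes` is hypothetical and only mimics how a `DataLoader` cuts a dataset into batches; it does not load any data.

```python
def batch_sizes(num_samples, batch_size, drop_last=False):
    """Sizes of the batches produced for a dataset of `num_samples` items."""
    full, remainder = divmod(num_samples, batch_size)
    sizes = [batch_size] * full
    if remainder and not drop_last:
        sizes.append(remainder)  # smaller last batch
    return sizes


print(batch_sizes(9, 4))                  # [4, 4, 1]
print(batch_sizes(9, 4, drop_last=True))  # [4, 4]
print(batch_sizes(8, 4))                  # [4, 4]
```

When the dataset size is divisible by the batch size, `drop_last` changes nothing, since there is no smaller batch to drop.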
I’m sorry for my confusing reply, and thank you for your fast reply. After reading your reply a second time, I realized that drop_last=True ONLY has an effect if the training data set is not divisible by the batch size. Otherwise it still trains on the last batch.
When I try to train a model using 4 GPUs I get the error
line 1902, in _verify_batch_size
raise ValueError('Expected more than 1 value per channel when training, got input size {}'.format(size))
ValueError: Expected more than 1 value per channel when training, got input size torch.Size([1, 256, 1, 1])
But the same code doesn’t have any problems when I use 2 GPUs. I’m using train_dl = DataLoader(train_ds, batch_size=10, shuffle=True, drop_last=True) for my loader. Is there a way I can spread this on 4 GPUs without causing the error? I’m using deeplabV3 for my model if this helps. Thank you
Based on the setup no GPU should get a single sample.
Could you add a print statement before the failing operation and check the shape of the input tensor for the 2 and 4 GPU runs?
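One possibility worth checking with that print statement, assuming the model is wrapped in `nn.DataParallel` (the post does not say which wrapper is used): to my knowledge, `DataParallel` splits each batch along dimension 0 the same way `torch.chunk` does, i.e. into chunks of `ceil(batch_size / num_gpus)` samples, so the last device can receive fewer samples than the others. The plain-Python sketch below only reproduces that arithmetic; the helper `chunk_sizes` is hypothetical.

```python
import math


def chunk_sizes(batch_size, num_devices):
    """Per-device sample counts under torch.chunk-style splitting (assumed)."""
    chunk = math.ceil(batch_size / num_devices)
    sizes = []
    remaining = batch_size
    while remaining > 0:
        sizes.append(min(chunk, remaining))
        remaining -= sizes[-1]
    return sizes


print(chunk_sizes(10, 2))  # [5, 5]
print(chunk_sizes(10, 4))  # [3, 3, 3, 1]
print(chunk_sizes(2, 2))   # [1, 1]
```

If the printed shapes confirm this, a batch size that is a multiple of the number of GPUs (for example 12 on 4 GPUs) should avoid the one-sample chunk.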
I’m getting the same error when I’m evaluating my medical image segmentation model.
I’ve used model.eval() and torch.no_grad() calls, but it still throws this error:
ValueError: Expected more than 1 value per channel when training, got input size torch.Size([1, 128, 1, 1, 1])
The model I’m using is a 3D Unet architecture with inputs of size (16,16,16); the model can be found here. As I’ve used model.eval() before, I think I should not be getting the above error.
I’m using NiftyNet to load the MRI images, and that takes care of the batch loading for the model.
I cannot share my actual code; however, it is very similar to what is done in this link.
Please highlight where I’m making a mistake. I’m still learning PyTorch.
The linked model works fine with a single input sample:
model = Modified3DUNet(1, 2).eval()
x = torch.randn(1, 1, 48, 48, 48)
out = model(x)
I cannot find any modifications to the model in the second link, which could add e.g. batchnorm layers.
Since you cannot share the code, I would recommend isolating the batchnorm layer that raises this error and checking its .training attribute.
If it’s not set to False after calling model.eval(), it might not be registered properly as a module.
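A small sketch of how that can happen, using two hypothetical toy modules (`Unregistered` and `Registered` are not taken from the linked model): a layer stored in a plain Python list is not registered as a submodule, so `model.eval()` never reaches it, whereas the same layer in an `nn.ModuleList` is switched to eval mode.

```python
import torch.nn as nn


class Unregistered(nn.Module):
    def __init__(self):
        super().__init__()
        # Plain Python list: the layer is NOT registered as a submodule.
        self.layers = [nn.BatchNorm1d(8)]


class Registered(nn.Module):
    def __init__(self):
        super().__init__()
        # nn.ModuleList registers the layer, so eval() propagates to it.
        self.layers = nn.ModuleList([nn.BatchNorm1d(8)])


bad = Unregistered().eval()
good = Registered().eval()

print(bad.layers[0].training)   # still in training mode
print(good.layers[0].training)  # switched to eval mode

# model.modules() only lists registered submodules, so it can be used to audit them.
print([type(m).__name__ for m in bad.modules()])
print([type(m).__name__ for m in good.modules()])
```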