# CNN on 2D image with one channel

Similar questions have been asked, but I have not been able to solve the problem. I am trying to create a CNN over an image with one channel and keep getting variations of a dimension error. The input image is of size (99 x 99) with a batch size of 4.

The shape of the input is (4 x 99 x 99). When I tried to pass this into a CNN, I got an error because I was telling it there was only one channel when the input appeared to have 99. So I called unsqueeze_(1) to get the shape (4 x 1 x 99 x 99), which seemed correct. However, I get this error:

RuntimeError: Expected 3-dimensional tensor, but got 4-dimensional tensor for argument #1 'self' (while checking arguments for max_pool1d)

Which I am not sure how to solve. I am using a nn.Conv2d layer as my first layer.

The `unsqueeze` was necessary, as your channel dimension was missing.
Since you are using an image, you should also use `nn.MaxPool2d` instead of `nn.MaxPool1d`.
Here is a small example:

```
import torch
import torch.nn as nn

model = nn.Sequential(
    nn.MaxPool2d(2)
)
x = torch.randn(4, 1, 99, 99)
output = model(x)
print(output.shape)
> torch.Size([4, 1, 49, 49])
```
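
For completeness, here is a sketch combining a first `nn.Conv2d` layer with the pooling fix; the number of filters and the kernel size are just placeholders:

```
import torch
import torch.nn as nn

# placeholder architecture for a single-channel 99x99 input
model = nn.Sequential(
    nn.Conv2d(1, 8, kernel_size=3),  # 8 filters is an arbitrary choice
    nn.ReLU(),
    nn.MaxPool2d(2),
)
x = torch.randn(4, 99, 99)  # shape as loaded: [batch, height, width]
x = x.unsqueeze(1)          # add the missing channel dim -> [4, 1, 99, 99]
print(model(x).shape)
> torch.Size([4, 8, 48, 48])
```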

Thank you, it works perfectly! A side question: by any chance do you know any resources that would help me understand which CNN architecture and kernel size to use? I am new to the space, and a lot of the blogs and textbooks I have read recommend searching for a similar model and copying that architecture. I am using a CNN in a not exactly conventional way, so I can't find anything similar.

It depends a bit on your image statistics etc.
Usually a kernel size of `3` works quite well for a "natural" image of approx. `224x224`, as a lot of models use it (see VGG etc.).
If you have a medical image (e.g. MRI) in a high resolution, this kernel size might not be optimal.
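
As a quick illustration (the channel count here is a placeholder), a `3x3` kernel with `padding=1` keeps the spatial size unchanged, so downsampling can be controlled by pooling or strides instead:

```
import torch
import torch.nn as nn

conv = nn.Conv2d(in_channels=1, out_channels=16, kernel_size=3, padding=1)
x = torch.randn(4, 1, 99, 99)
print(conv(x).shape)
> torch.Size([4, 16, 99, 99])
```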

A good resource is Stanford's CS231n to get a feel for convolutions.


Hi,
I'm having a similar problem with avg_pool1d.

This is the code:
```
self.bn1 = nn.BatchNorm2d(64)
self.layer1 = self._make_layer(block, 64, num_blocks[0], stride=1)
self.layer2 = self._make_layer(block, 128, num_blocks[1], stride=2)
self.layer3 = self._make_layer(block, 256, num_blocks[2], stride=2)
self.layer4 = self._make_layer(block, 512, num_blocks[3], stride=2)
if (self.extraClasses == 1 or extraLayer == 1) and extraFeat == False:
    self.linear = nn.Linear(512 * block.expansion, num_classes)
if self.pool:
    self.scores = nn.AvgPool1d(2, stride=2)

def forward(self, x):
    out = F.relu(self.bn1(self.conv1(x)))
    cnnOut = out
    layer1 = self.layer1(out)
    layer2 = self.layer2(layer1)
    layer3 = self.layer3(layer2)
    layer4 = self.layer4(layer3)

    out = F.avg_pool2d(layer4, 4)
    out = out.view(out.size(0), -1)
    featureLayer = out
    pool = 0
    if self.extraClasses == 1:
        out = self.linear(out)
    if self.pool:
        pool = self.scores(out)
```

When attempting to calculate the pooling, I get the following:

RuntimeError: Expected 3-dimensional tensor, but got 2-dimensional tensor for argument #1 'self' (while checking arguments for avg_pool1d)

thanks!

`nn.AvgPool1d` expects a 3D input tensor in the shape `[batch_size, channels, seq_len]`, while you are trying to use a 2D tensor. Could you explain your use case a bit more and what result would be expected?
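
For illustration, a minimal sketch of the expected shape (the sizes are placeholders):

```
import torch
import torch.nn as nn

pool = nn.AvgPool1d(kernel_size=2, stride=2)
x = torch.randn(10, 1, 20)  # [batch_size, channels, seq_len]
print(pool(x).shape)
> torch.Size([10, 1, 10])
```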

I want to take every 2 neurons at the end and sum/average them in the last layer, meaning that if I had 20 neurons in the second-to-last layer, I'll have 10 neurons in the last layer.

Assuming “between them” means neighboring features, a `view` and `mean` operation should work:

```
import torch

batch_size = 10
features = 20
x = torch.arange(batch_size * features).view(batch_size, features).float()

# group neighboring features in pairs and average each pair
x = x.view(batch_size, -1, 2)
y = x.mean(dim=2)
print(y)
```
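
Alternatively, if you'd rather keep your `nn.AvgPool1d` module, a sketch of the same idea: add a dummy channel dimension so the input becomes 3D, pool, and squeeze it out again.

```
import torch
import torch.nn as nn

batch_size = 10
features = 20
x = torch.arange(batch_size * features).view(batch_size, features).float()

# nn.AvgPool1d needs a 3D input, so add a channel dim, pool, and remove it
pool = nn.AvgPool1d(kernel_size=2, stride=2)
y = pool(x.unsqueeze(1)).squeeze(1)
print(y.shape)
> torch.Size([10, 10])
```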