I have a four-GPU setup (24 GB each) where I am trying to train a DeepLabV3Plus model using the segmentation_models_pytorch library.
I am facing this error: ValueError: Caught ValueError in replica 0 on device 0. ValueError: Expected more than 1 value per channel when training, got input size torch.Size([1, 256, 1, 1])
The batch size I am using is 8.
Can you please help me resolve this issue?
Thank you!
It works with the default settings. The library offers a pool of encoders to choose from, and I am experimenting with different ones: for some it works, for some it doesn’t.
Thank you for replying so quickly!
Really appreciate it.
Right, I misread the original error. It looks like your “encoder” might be downsampling the input too much. Could you check whether downsampling is a setting that is available to you, or work around the issue by increasing the input size so that the spatial dimensions are not reduced to 1x1?
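To make the failure condition concrete: in training mode, BatchNorm needs more than one value per channel (batch × height × width > 1) to estimate statistics, which is exactly what the error message complains about for shape [1, 256, 1, 1]. A plain-Python sketch of that check (the function names here are mine, not torch’s):

```python
def values_per_channel(shape):
    """Number of values BatchNorm sees per channel for an NCHW input."""
    n, _, h, w = shape
    return n * h * w

def batchnorm_would_fail(shape):
    """True if training-mode BatchNorm would reject this input shape."""
    return values_per_channel(shape) <= 1

print(batchnorm_would_fail((1, 256, 1, 1)))  # True  -> the reported failing shape
print(batchnorm_would_fail((1, 256, 2, 2)))  # False -> larger spatial size is fine
print(batchnorm_would_fail((2, 256, 1, 1)))  # False -> a batch of 2 is also fine
```

So either a larger spatial map or more than one sample per replica avoids the error.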
Right, so you might look into:
decreasing encoder_output_stride, increasing upsampling, or increasing the resolution of the input image as possible solutions.
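To see how those suggestions interact, here is a rough sketch of the spatial size that reaches the deepest layers, assuming the encoder shrinks the input by `output_stride` in each spatial dimension (ceil division, as padded conv/pool stacks typically do; the numbers are illustrative, not taken from a specific encoder):

```python
import math

def encoder_spatial(input_size, output_stride):
    """Approximate spatial size of the encoder output for a square input."""
    return math.ceil(input_size / output_stride)

print(encoder_spatial(32, 32))   # 1 -> collapses to 1x1, error-prone
print(encoder_spatial(32, 16))   # 2 -> smaller encoder_output_stride helps
print(encoder_spatial(256, 32))  # 8 -> larger input resolution helps
```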
@eqy already explained why the error might be raised; however, it still doesn’t fit your description:
I have a four gpu setup (24 GB each) where I am trying to train a DeepLabV3Plus […] got input size torch.Size([1, 256, 1, 1])
I don’t know if you are using data parallel (I would assume so), which would yield a batch size of 2 on each of the 4 GPUs assuming the global batch size is 8. If the local batch size is set to 8, then of course each GPU should get 8 samples, while the error indicates a single sample.
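For reference, nn.DataParallel scatters the batch along dim 0 roughly like torch.chunk does: chunks of size ceil(n / num_gpus) until the batch is exhausted. A plain-Python approximation of that split (a sketch of the chunking arithmetic, not the actual scatter implementation):

```python
import math

def split_batch(batch_size, num_gpus):
    """Approximate per-replica batch sizes under torch.chunk-style splitting."""
    chunk = math.ceil(batch_size / num_gpus)
    sizes = []
    remaining = batch_size
    while remaining > 0:
        sizes.append(min(chunk, remaining))
        remaining -= sizes[-1]
    return sizes

print(split_batch(8, 4))  # [2, 2, 2, 2] -> every replica sees 2 samples
print(split_batch(5, 4))  # [2, 2, 1]   -> a trailing replica sees 1 sample
print(split_batch(1, 4))  # [1]         -> single sample, BatchNorm will fail
```

One common way a replica ends up with a single sample is an incomplete last batch when the dataset size is not divisible by the batch size; DataLoader’s drop_last=True avoids this.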
This would mean that each of the four GPUs should process 2 samples for a global batch size of 8. Could you add print statements to the forward method and post the shape of the input as well as of all activation tensors? I guess you might either be using an invalid reshaping operation in the forward or your batch size is not 8.
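If editing the library’s forward method is inconvenient, forward hooks can record the output shape of every submodule instead. A sketch with a tiny stand-in model (attach_shape_hooks is my helper name, not a torch API; substitute your own model for the Sequential below):

```python
import torch
import torch.nn as nn

def attach_shape_hooks(model, log):
    """Register hooks that record (module_name, output_shape) on each leaf module."""
    handles = []
    for name, module in model.named_modules():
        if len(list(module.children())) == 0:  # leaf modules only
            def hook(mod, inputs, output, name=name):
                log.append((name, tuple(output.shape)))
            handles.append(module.register_forward_hook(hook))
    return handles

# Tiny stand-in model: two stride-2 convs halve the spatial size twice.
model = nn.Sequential(
    nn.Conv2d(3, 8, kernel_size=3, stride=2, padding=1),
    nn.Conv2d(8, 16, kernel_size=3, stride=2, padding=1),
)
shapes = []
handles = attach_shape_hooks(model, shapes)
model(torch.randn(2, 3, 8, 8))
for name, shape in shapes:
    print(name, shape)  # prints each layer's output shape
for h in handles:
    h.remove()
```

Running this on the real model (with the real per-GPU input) should show exactly where the batch or spatial dimension collapses to 1.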