How to get a 5-dimensional input from a 3D array tensor?

I want to test whether my computer can train a 3D UNet on a 3D array.
To do this I am using torch.zeros of shape (64, 192, 192) as the input to my model. The (incomplete) code is the following:

import torch
import torch.optim as optim

array_3D = torch.zeros(64, 192, 192)
model = UNet()
loss_function = DiceLoss()
optimizer = optim.Adam(model.parameters(), lr=1e-4)
epochs = 10

for epoch in range(epochs):
    batch_size = 1
    model.train()

    for data in range(1):
        batch_train_source_numpy = array_3D
        batch_train_logits_cuda = model(batch_train_source_numpy)

but then I get this error:

RuntimeError: Expected 5-dimensional input for 5-dimensional weight [16, 1, 3, 3, 3], but got 3-dimensional input of size [64, 192, 192] instead

How can I add a batch size, and how many channels do I need, to get a 5-dimensional input?

Hi nestlee,

Cannot reproduce without seeing how you defined UNet and DiceLoss, but I’m going to guess that the model expects an input shaped like: (N, C, H, W, D) where:

  • N = minibatch size
  • C = number of channels, 3 if image is expected to be RGB, 1 if black-and-white
  • H, W, D = height, width, depth – note that these could be in some other order, look up your model’s documentation to be sure

However, the input you passed was shaped just (H, W, D). So you probably need to unsqueeze along the first (minibatch) dimension and add a channel dimension. Something like:

array_3D = torch.zeros(3, 64, 192, 192)
...
batch_train_source_numpy = array_3D.view(-1, 3, 64, 192, 192)
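
(Or, if your images turn out to be single-channel, here is a sketch along the same lines using unsqueeze, assuming your model's first convolution takes 1 input channel:)

# hypothetical single-channel version: add a batch dim and a channel dim
array_3D = torch.zeros(64, 192, 192)
batch_train_source_numpy = array_3D.unsqueeze(0).unsqueeze(0)  # shape (1, 1, 64, 192, 192)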

Hope this helps!

Hi Andrei_Cristea,

I have black-and-white images.

Here is the code for the 3D UNet:

  import torch
  import torch.nn as nn

  def double_conv(in_channels, out_channels):
      conv = nn.Sequential(
          nn.Conv3d(in_channels, out_channels, kernel_size=3, padding=1),
          nn.ReLU(inplace=True),
          nn.Conv3d(out_channels, out_channels, kernel_size=3, padding=1),
          nn.ReLU(inplace=True)
      )
      return conv
  
  def crop_image(tensor, target_tensor):
      target_size = target_tensor.size()[2]
      tensor_size = tensor.size()[2]
      delta = tensor_size - target_size
      delta = delta//2
      return tensor[:, :, delta:tensor_size-delta, delta:tensor_size-delta]
  
  class UNet(nn.Module):
      def __init__(self):
          super(UNet, self).__init__()
  
          #definition of max pooling :
          self.max_pool_2x2_1 = nn.MaxPool3d(kernel_size=2, stride=2)
          self.max_pool_2x2_2 = nn.MaxPool3d(kernel_size=2, stride=2)
          self.max_pool_2x2_3 = nn.MaxPool3d(kernel_size=2, stride=2)
          self.max_pool_2x2_4 = nn.MaxPool3d(kernel_size=2, stride=2)
          self.max_pool_2x2_5 = nn.MaxPool3d(kernel_size=2, stride=2)
          self.down_conv_1 = double_conv(1, 16)
          self.down_conv_2 = double_conv(16, 32)
          self.down_conv_3 = double_conv(32, 64)
          self.down_conv_4 = double_conv(64, 128)
          self.down_conv_5 = double_conv(128, 256)
          #self.down_conv_6 = double_conv(256, 512)
  
          self.up_conv_trans2 = nn.ConvTranspose3d(in_channels=256, out_channels=128, kernel_size=2, stride=2)
          self.up_conv_2 = double_conv(256, 128)
  
          self.up_conv_trans3 = nn.ConvTranspose3d(in_channels=128, out_channels=64, kernel_size=2, stride=2)
          self.up_conv_3 = double_conv(128, 64)
  
          self.up_conv_trans4 = nn.ConvTranspose3d(in_channels=64, out_channels=32, kernel_size=2, stride=2)
          self.up_conv_4 = double_conv(64, 32)
  
          self.up_conv_trans5 = nn.ConvTranspose3d(in_channels=32, out_channels=16, kernel_size=2, stride=2)
          self.up_conv_5 = double_conv(32, 16)
  
  
          self.out = nn.Conv3d(in_channels=16, out_channels=1, kernel_size=1)
  
  
      def forward(self, image):
          #encoder part
          out_conv1 = self.down_conv_1(image)
          out_pool1 = self.max_pool_2x2_1(out_conv1)
          out_conv2 = self.down_conv_2(out_pool1)
          out_pool2 = self.max_pool_2x2_2(out_conv2)
          out_conv3 = self.down_conv_3(out_pool2)
          out_pool3 = self.max_pool_2x2_3(out_conv3)
          out_conv4 = self.down_conv_4(out_pool3)
          out_pool4 = self.max_pool_2x2_4(out_conv4)
          out_conv5 = self.down_conv_5(out_pool4)
          # out_pool5 = self.max_pool_2x2_5(out_conv5)
          # out_conv6 = self.down_conv_6(out_pool5)
          #print(f"outconv6 = {out_conv6.size()}")
  
          #decoder part
          out_up_conv = self.up_conv_trans2(out_conv5)
  
          y = crop_image(out_conv4, out_up_conv)
          #print(y.size())
          out_up_conv = self.up_conv_2(torch.cat([out_up_conv, y], 1))
  
          out_up_conv = self.up_conv_trans3(out_up_conv)
          y = crop_image(out_conv3, out_up_conv)
          out_up_conv = self.up_conv_3(torch.cat([out_up_conv, y], 1))
  
          out_up_conv = self.up_conv_trans4(out_up_conv)
          y = crop_image(out_conv2, out_up_conv)
          out_up_conv = self.up_conv_4(torch.cat([out_up_conv, y], 1))
  
          out_up_conv = self.up_conv_trans5(out_up_conv)
          y = crop_image(out_conv1, out_up_conv)
          out_up_conv = self.up_conv_5(torch.cat([out_up_conv, y], 1))
  
          out_up_conv = self.out(out_up_conv)
          return out_up_conv

and my Dice loss function:

  class DiceLoss(nn.Module):
      def __init__(self, weight=None, size_average=True):
          super(DiceLoss, self).__init__()
  
      def forward(self, inputs, targets):
  
          inputs = torch.sigmoid(inputs)
  
          # flatten label and prediction tensors
          inputs = inputs.view(-1)
          targets = targets.view(-1)
  
          intersection = (inputs * targets).sum()
          dice = (2. * intersection) / (inputs.sum() + targets.sum())
  
          return 1 - dice

I tried your suggestion, but I get this error:

RuntimeError: Sizes of tensors must match except in dimension 1. Expected size 24 but got size 8 for tensor number 1 in the list.

What should I do?
Thank you for your help.

Your model expects an input where the lengths of the 3 spatial dimensions are equal. This works for me:

array_3D = torch.zeros(1, 192, 192, 192)
batch_train_source_numpy = array_3D.view(-1, 1, 192, 192, 192)
model = UNet()
print(model(batch_train_source_numpy))
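
If you also want to exercise the backward pass, here is a sketch of one dummy iteration, assuming the loss_function and optimizer from your first snippet are in scope, and using an all-zeros target purely as a memory/speed test:

# all-zeros dummy target with the same shape as the model output
dummy_target = torch.zeros(1, 1, 192, 192, 192)
logits = model(batch_train_source_numpy)
loss = loss_function(logits, dummy_target)  # DiceLoss from your snippet
optimizer.zero_grad()
loss.backward()
optimizer.step()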

Hi Nestlee!

Andrei’s observation that your model expects three equal dimensions
is partially correct, but your crop_image() has additional problems.

crop_image() does crop the first and second dimensions of your
three-dimensional “image,” but it computes the crop from the size of the
first one only, so if those two dimensions differ, the second will be cropped incorrectly.

Furthermore, it doesn’t crop the third dimension of your “image” (which
is the fifth dimension of your 5d input) at all.

Lastly, delta = delta//2 rounds down, so if delta (before dividing) is
only 1, you won’t crop at all, when you should probably crop one pixel
from one side of the “image” and none from the other.

Any one of these three issues can cause the cropping to go wrong, so that
your cropped upstream “image” and your downstream image won’t have the
same shape, and torch.cat([out_up_conv, y], 1) will fail with the error
message you posted.
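
One possible way to fix crop_image() (just a sketch, not tested against your
full model) is to compute a separate crop for each of the three spatial
dimensions, and to crop one extra voxel from one side when the difference
is odd:

def crop_image(tensor, target_tensor):
    # keep the batch and channel dimensions untouched
    slices = [slice(None), slice(None)]
    # crop each of the three spatial dimensions independently
    for dim in range(2, 5):
        delta = tensor.size(dim) - target_tensor.size(dim)
        lo = delta // 2                        # crop this much from the front
        hi = tensor.size(dim) - (delta - lo)   # and the remainder from the back
        slices.append(slice(lo, hi))
    return tensor[tuple(slices)]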

Andrei’s test case of array_3D = torch.zeros(1, 192, 192, 192)
worked because every downsampling (MaxPool3d) happens to get fed
a tensor whose dimensions are divisible by 2, so that no actual cropping
ever had to happen.

Try running your model on examples such as:

batch_array_3D = torch.zeros(1, 1, 190, 190, 190)
# or
batch_array_3D = torch.zeros(1, 1, 192, 192, 196)

Best.

K. Frank
