MONAI 5D output from DataLoader not working for non-MONAI models

Hi,

I was delving into @MONAI, but I'm having some issues with the dimensions of the outputs from the dataloader.

It seems that when loading NIfTI data, the DataLoader always returns 5D input tensors. That works fine for the models provided by MONAI, but not for native torch models such as the UNet by milesial or similar.

Is there any workaround that lets me use this data loader with custom models? I know the NIfTI files are 3D images; is that the problem?

First time posting, sorry for any possible mistakes!

Below is a snippet of my code that demonstrates the problem:

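import os
from glob import glob

import torch
import monai
from monai.data import DataLoader, Dataset  # assumed; torch.utils.data.DataLoader may have been used instead
from monai.transforms import AddChanneld, Compose, LoadImaged, Resized, ToTensord
from unet import UNet  # assumed import path for the milesial Pytorch-UNet repo

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

tempdir = './prostate/'
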
images = sorted(glob(os.path.join(tempdir, "images", "*T2.nii.gz")))
segs = sorted(glob(os.path.join(tempdir, "masks", "*t2.nii.gz")))

data_dicts = [
    {"image": image_name, "label": label_name}
    for image_name, label_name in zip(images, segs)
]
# Hold out the last 3 cases for validation so the two splits never overlap.
train_files, val_files = data_dicts[:-3], data_dicts[-3:]

train_transforms = Compose(
    [
        LoadImaged(keys=["image", "label"]),
        AddChanneld(keys=["image", "label"]),
        Resized(keys=["image", "label"], spatial_size=(128, 128, 32), mode=('trilinear', 'nearest')),
        ToTensord(keys=["image", "label"]),
    ]
)

val_transforms = Compose(
    [
        LoadImaged(keys=["image", "label"]),
        AddChanneld(keys=["image", "label"]),
        Resized(keys=["image", "label"], spatial_size=(128, 128, 32), mode=('trilinear', 'nearest')),
        ToTensord(keys=["image", "label"]),
    ]
)

train_ds = Dataset(data=train_files, transform=train_transforms)
train_loader = DataLoader(train_ds, batch_size=2, shuffle=True, num_workers=4)

val_ds = Dataset(data=val_files, transform=val_transforms)
val_loader = DataLoader(val_ds, batch_size=1, num_workers=4)


model = UNet(1,2).to(device) # Unet from milesial repo, 1 channel input 2 channel output
loss_function = monai.losses.DiceLoss(sigmoid=True)
optimizer = torch.optim.Adam(model.parameters(), 1e-4)

epoch_num = 5
val_interval = 2
best_metric = -1
best_metric_epoch = -1
epoch_loss_values = list()
metric_values = list()

for epoch in range(epoch_num):
    print("-" * 10)
    print(f"epoch {epoch + 1}/{epoch_num}")
    model.train()
    epoch_loss = 0
    step = 0
    for batch_data in train_loader:
        step += 1
        inputs = batch_data["image"].to(device)
        labels = batch_data["label"].to(device)

        print(inputs.shape, '\n')

        optimizer.zero_grad()
        outputs = model(inputs)
        loss = loss_function(outputs, labels)
        loss.backward()
        optimizer.step()
        epoch_loss += loss.item()
        print(f"{step}/{len(train_ds) // train_loader.batch_size}, train_loss: {loss.item():.4f}")
    epoch_loss /= step
    epoch_loss_values.append(epoch_loss)
    print(f"epoch {epoch + 1} average loss: {epoch_loss:.4f}")

Output from the print: torch.Size([2, 1, 128, 128, 32])

Error: Expected 4-dimensional input for 4-dimensional weight [64, 3, 3, 3], but got 5-dimensional input of size [2, 1, 128, 128, 32] instead

Well, the UNet you referred to handles only 4D arrays (batch, channel, height, width). You need a UNet model that handles 5D arrays (batch, channel, height, width, depth); such models use Conv3d layers and other 3D operations.
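To make the difference concrete, here is a minimal sketch that uses single convolution layers in place of the full networks. The tensor is deliberately smaller than the 128 x 128 x 32 volumes in the question, and the layer sizes are arbitrary choices for illustration, not values taken from the milesial UNet.

```python
import torch
import torch.nn as nn

# Same layout as the batch in the question, (batch, channel, H, W, D),
# but with small spatial sizes so the example runs quickly.
batch = torch.zeros(2, 1, 16, 16, 8)

# Illustrative layers only; channel counts are arbitrary.
conv2d = nn.Conv2d(in_channels=1, out_channels=4, kernel_size=3, padding=1)
conv3d = nn.Conv3d(in_channels=1, out_channels=4, kernel_size=3, padding=1)

# A 2D convolution expects (batch, channel, H, W) and rejects the extra axis.
try:
    conv2d(batch)
    conv2d_accepts_5d = True
except (RuntimeError, ValueError):
    conv2d_accepts_5d = False

# A 3D convolution consumes all three spatial axes.
out3d = conv3d(batch)

print(conv2d_accepts_5d)
print(tuple(out3d.shape))
```

The same reasoning applies to the other layers in a 2D UNet (pooling, upsampling, normalization): each one needs its 3D counterpart before the network can take 5D batches.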

First of all, thank you very much. Switching all operations to 3D does allow the use of 5D arrays and solves the problem.

As a follow-up, still regarding the dataloader output: is it possible to load the NIfTI images but treat each one not as a single 3D volume with some depth, but as several 2D images, so that the batch ends up as a 4D array instead of 5D?
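One possible approach, which nobody in this thread has confirmed, is to leave the dataloader as it is and fold the depth axis into the batch axis just before the forward pass. The sketch below uses plain torch only. The helper name `volume_batch_to_slices` is hypothetical, and it assumes depth is the last axis, as in the (2, 1, 128, 128, 32) batch printed above.

```python
import torch


def volume_batch_to_slices(x: torch.Tensor) -> torch.Tensor:
    """Turn (B, C, H, W, D) into (B * D, C, H, W): one 2D image per depth index."""
    b, c, h, w, d = x.shape
    # Move depth next to batch, (B, D, C, H, W), then merge the two leading axes.
    return x.permute(0, 4, 1, 2, 3).reshape(b * d, c, h, w)


# Small dummy batch: 2 volumes, 1 channel, 4 x 4 in-plane, 3 slices each.
x = torch.arange(2 * 1 * 4 * 4 * 3, dtype=torch.float32).reshape(2, 1, 4, 4, 3)
slices = volume_batch_to_slices(x)

print(tuple(slices.shape))
```

The labels would need the same reshaping so that each 2D image still lines up with its mask. Note that the effective batch size becomes B * D, and a 2D model sees each slice without its neighbours, so context along the depth axis is lost.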