Thank you so much! I was stuck at this error for so long.
Hi everyone, I'm new to PyTorch and I have the same problem because I have rectangular images of different sizes.
So, if I understood correctly, the DataLoader does not natively support images of different shapes, is that right?
If so, is there a workaround to handle such cases?
Thanks in advance
Yes, the default collate_fn used in the DataLoader tries to torch.stack the inputs and will fail if the samples have different shapes. A fix would be to write a custom collate_fn and return the samples in e.g. a list. Note that while this would fix the creation of the batch in the DataLoader, your model would most likely not be able to use the list as an input, and you would then need to pass each sample separately, resize it, etc.
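For reference, a minimal sketch of such a custom collate_fn, assuming the dataset returns (image, target) tuples (list_collate and the variable names are illustrative):

    from torch.utils.data import DataLoader

    def list_collate(batch):
        # Keep the variable-sized images in plain Python lists instead of
        # stacking them into one tensor (the stacking is what fails here).
        images = [sample[0] for sample in batch]
        targets = [sample[1] for sample in batch]
        return images, targets

    # loader = DataLoader(dataset, batch_size=4, collate_fn=list_collate)

The model then has to iterate over (or resize) the images in the list itself, since most layers still expect a single batched tensor.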
Thanks for this precise response. This worked
I’ve got the same error. I solved it by making the following changes in my script:

- Use a tuple in Resize as follows:

      transforms.Resize((img_size, img_size))

- Define a new collate_fn function (following the solution of @ptrblck):

      def collate_fn(batch):
          batch = list(filter(lambda x: x is not None, batch))
          return torch.utils.data.dataloader.default_collate(batch)

- Add it to my DataLoader:

      train_loader = torch.utils.data.DataLoader(dataset=train_transform_data, batch_size=batch_size, shuffle=True, collate_fn=collate_fn)
      valid_loader = torch.utils.data.DataLoader(dataset=valid_transform_data, batch_size=batch_size, shuffle=True, collate_fn=collate_fn)
Thank you
Can we try using batch_size = 1? With batch_size=1, the PyTorch DataLoader doesn't stack multiple samples, so there is no need to collate, and it works. I was able to remove the error by using a batch size of 1. My doubt is: what will happen if we update the weights only after every 16 samples with batch_size=1? For the DataLoader the batch_size is 1, but the weights are updated and optimized after every 16 samples, so my effective batch size will in reality be 16. Are there any consequences of using this approach? Can I add the losses for each sample and then backpropagate them once 16 samples are evaluated?
Yes, this approach is known as gradient accumulation and could work.
The issue I’m seeing is that batch-size-dependent layers, such as batchnorm layers, might update the running stats with noisy estimates coming from a single sample (or they might even crash if not enough values are available to calculate the stats).
Besides that, gradient accumulation is a valid approach and should yield the same results (again assuming that all layers have the same “behavior”).
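For illustration, a minimal sketch of gradient accumulation with batch_size=1; accum_steps and the loader/model/loss names are assumptions, not code from this thread:

    accum_steps = 16  # effective batch size when the DataLoader uses batch_size=1

    optimizer.zero_grad()
    for step, (inputs, targets) in enumerate(train_loader):
        outputs = model(inputs)
        # Scale the loss so the accumulated gradient matches an average over accum_steps samples.
        loss = loss_function(outputs, targets) / accum_steps
        loss.backward()  # gradients accumulate in the .grad buffers
        if (step + 1) % accum_steps == 0:
            optimizer.step()
            optimizer.zero_grad()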
Thanks @ptrblck, I tried all the possible ways to solve this issue but didn’t succeed. It got solved with your advice.
Hello, I tried to Resize the image. I did Resize((size, size)) but it did not work for me.
transforms.Compose([transforms.ToPILImage(),
                    transforms.Resize((224, 224)),
                    transforms.ToTensor(),
                    transforms.Normalize(mean=[0.485, 0.456, 0.406],
                                         std=[0.229, 0.224, 0.225])])
I get the same issue:
stack expects each tensor to be equal size, but got [60, 80, 3] at entry 0 and [107, 80, 3] at entry 1
Based on your error message it seems you are not applying the transformation, so try to narrow down where the error is raised and why the transformation wasn’t used.
Thank you very much. It worked. Turns out I had an error in the Dataset Splitting class and transformation was not applied.
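For anyone else hitting this, a minimal sketch of a Dataset that actually applies its stored transform in __getitem__ (the class and variable names are illustrative; it assumes a pipeline like the Compose above, which starts with ToPILImage() and therefore expects a NumPy array):

    import numpy as np
    from PIL import Image
    from torch.utils.data import Dataset

    class ImageListDataset(Dataset):
        def __init__(self, image_paths, labels, transform=None):
            self.image_paths = image_paths
            self.labels = labels
            self.transform = transform

        def __getitem__(self, index):
            # Load as a NumPy array so a pipeline beginning with ToPILImage() works.
            img = np.array(Image.open(self.image_paths[index]).convert("RGB"))
            if self.transform is not None:
                # If this call is skipped, every image keeps its original size and the
                # default collate_fn fails with "stack expects each tensor to be equal size".
                img = self.transform(img)
            return img, self.labels[index]

        def __len__(self):
            return len(self.image_paths)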
max_epochs = 600
val_interval = 2
best_metric = -1
best_metric_epoch = -1
epoch_loss_values = []
metric_values = []
post_pred = Compose([AsDiscrete(argmax=True, to_onehot=3)])
post_label = Compose([AsDiscrete(to_onehot=2)])

for epoch in range(max_epochs):
    print("-" * 10)
    print(f"epoch {epoch + 1}/{max_epochs}")
    model.train()
    epoch_loss = 0
    step = 0
    for batch_data in train_loader:
        step += 1
        inputs, labels = (
            batch_data["image"].to(device),
            batch_data["label"].to(device),
        )
        optimizer.zero_grad()
        outputs = model(inputs)
        loss = loss_function(outputs, labels)
        loss.backward()
        optimizer.step()
        epoch_loss += loss.item()
        print(
            f"{step}/{len(train_ds) // train_loader.batch_size}, "
            f"train_loss: {loss.item():.4f}")
    epoch_loss /= step
    epoch_loss_values.append(epoch_loss)
    print(f"epoch {epoch + 1} average loss: {epoch_loss:.4f}")

    if (epoch + 1) % val_interval == 0:
        model.eval()
        with torch.no_grad():
            for val_data in val_loader:
                val_inputs, val_labels = (
                    val_data["image"].to(device),
                    val_data["label"].to(device),
                )
                roi_size = (160, 160, 160)
                sw_batch_size = 4
                val_outputs = sliding_window_inference(
                    val_inputs, roi_size, sw_batch_size, model)
                val_outputs = [post_pred(i) for i in decollate_batch(val_outputs)]
                val_labels = [post_label(i) for i in decollate_batch(val_labels)]
                # compute metric for current iteration
                dice_metric(y_pred=val_outputs, y=val_labels)

            # aggregate the final mean dice result
            metric = dice_metric.aggregate().item()
            # reset the status for next validation round
            dice_metric.reset()

            metric_values.append(metric)
            if metric > best_metric:
                best_metric = metric
                best_metric_epoch = epoch + 1
                torch.save(model.state_dict(), os.path.join(
                    root_dir, "best_metric_model.pth"))
                print("saved new best metric model")
            print(
                f"current epoch: {epoch + 1} current mean dice: {metric:.4f}"
                f"\nbest mean dice: {best_metric:.4f} "
                f"at epoch: {best_metric_epoch}"
            )
Hey, I have tried to solve my problem using your solution. Now it returns the following error. Can you help?
TypeError: list indices must be integers or slices, not str
I got the same error, but in my case it’s important to keep the original image size. I mean, I want to build a NN that takes and returns images of different sizes.
Where can I find an example of such a NN?
I have followed everything said under this topic but I am still getting this error: RuntimeError: stack expects each tensor to be equal size, but got [3, 736, 353] at entry 0 and [3, 729, 350] at entry 1
Here is my code snippet:
def __getitem__(self, index):
    image_path = self.im_files[index].rstrip()
    im_name_splits = image_path.split(os.sep)[-1].split('.')[0].split('_')
    img = Image.open(image_path)
    resize = F.Resize((385, 185), interpolation=InterpolationMode.BILINEAR)
    crop = F.CenterCrop((385, 185))
    img = crop(resize(img))
    img = np.array(img, dtype=np.uint8)
    lable_path = os.path.join(self.labels_base, im_name_splits[0] + '_road_' + im_name_splits[1] + '.png')
    label = Image.open(lable_path)
    label = crop(resize(label))
    label = np.array(label, dtype=np.uint8)
    if label is None:
        raise ValueError(f"Invalid label image: {self.label_paths[image_path]}")
    # Convert NumPy arrays to PIL images
    img = Image.fromarray(img)
    label = Image.fromarray(label)
    img = img.convert("RGB")
    label = label.convert("RGB")
    # Get the dimensions of the label image using the .size attribute
    label_size = label.size
    print(f"label size: {label_size}")
    # convert to binary mask
    road_label = np.array([255, 0, 255])
    # reshape the road_label array to match the label array
    road_label = np.reshape(road_label, (1, 1, 3))
    # cast both arrays to the same dtype, such as uint8
    road_label = road_label.astype(np.uint8)
    # Check if label is a grayscale image
    if label.mode == 'L':
        label = label
    else:
        # It's a color image with color channels, apply np.all along axis=2
        label = np.array(label)
        cond = np.all(label == road_label, axis=2)
        label = label * cond[..., np.newaxis]
    # convert the label to grayscale
    label = np.dot(label[..., :3], [0.2989, 0.5870, 0.1140])
    label = np.expand_dims(label, axis=-1)
    # Convert the PIL image to a NumPy array
    img = np.array(img)
    img = normalize(img, mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225])
    img = img.transpose((2, 0, 1))
    label = label.transpose((2, 0, 1))
    img = torch.from_numpy(img)
    gt_image = torch.from_numpy(label)
    sample = {'image': img, 'label': gt_image}
    if self.transform:
        sample = resize(sample)
        sample = crop(sample)
    if self.augmentations:
        sample = self.augmentations(sample)
    return sample

def __len__(self):
    # print(f"The number of labels is : {len(self.im_files)}")
    return len(self.im_files)
class KittiDatasetTest(Dataset):
    def __init__(self, rootdir, transform=None):
        self.transform = transform
        self.rootdir = rootdir
        self.images_base = os.path.join(self.rootdir, 'testing', 'image_2')
        self.im_files = recursive_glob(rootdir=self.images_base, suffix='.png')
        self.im_files = sorted(self.im_files)

    def __getitem__(self, index):
        image_path = self.im_files[index].rstrip()
        img = Image.open(image_path)
        resize = F.Resize((385, 185), interpolation=InterpolationMode.BILINEAR)
        crop = F.CenterCrop((385, 185))
        img = crop(resize(img))
        img = np.array(img, dtype=np.uint8)
        img = normalize(np.array(img), mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225])
        img = img.transpose((2, 0, 1))
        img = torch.from_numpy(img)
        sample = {'image': img}
        if self.transform:
            sample = resize(sample)
            sample = crop(sample)
        return sample

    def __len__(self):
        # self.im_files is already the sorted list of image paths
        return len(self.im_files)
If tensor_1 is [3, 736, 353], change the tensor [3, 729, 350] to the same shape, [3, 736, 353]. It looks like you’re working with images, so you have to resize the images to that shape. That should work.
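For example, a minimal sketch of forcing every image to one fixed shape before batching (the target size and file name here are assumptions, not values from this thread):

    from PIL import Image
    from torchvision import transforms

    # Assumed target size; pick whatever resolution suits your model.
    target_size = (736, 353)  # (height, width)

    resize = transforms.Resize(target_size)
    to_tensor = transforms.ToTensor()

    img = to_tensor(resize(Image.open("some_image.png").convert("RGB")))
    # img.shape is now [3, 736, 353] for every sample, so the default
    # collate_fn can stack the batch without a size mismatch.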
I got this error when getting items. I checked it and found that one of the loss labels was “tensor(, dtype=torch.float64)”, which really beats me!