I’m trying to use the torchvision Pascal VOC dataset:
from torchvision.datasets import VOCDetection
My problem is that I’m running into issues when resizing the images with transforms. Here’s my code:
from torchvision import transforms
from torch.utils.data import DataLoader, random_split

data_transforms = transforms.Compose([
    transforms.Resize(size=(256, 300)),
    transforms.ToTensor(),
    transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225]),
])
dataset = VOCDetection(root="./", download=True, transform=data_transforms)
train_size = int(len(dataset) * 0.9)
val_size = len(dataset) - train_size
train, val = random_split(dataset, [train_size, val_size])
train_loader = DataLoader(train, batch_size=32, num_workers=4)
val_loader = DataLoader(val, batch_size=32, num_workers=4)
I’m passing the size as a tuple, but I still get this error when I run it:
RuntimeError: each element in list of batch should be of equal size
If I instead pass a single int, like
transforms.Resize(size=256)
(note that size=(256) is just the int 256 in parentheses, not a tuple), I get a more reasonable error:
RuntimeError: stack expects each tensor to be equal size, but got [3, 256, 341] at entry 0 and [3, 341, 256] at entry 1
which makes sense based on the transforms.Resize documentation: with a single int, only the shorter side is resized to that value, so the images end up with different shapes. What confuses me is that even when I pass the size as a tuple, so every image should come out as 3×256×300, I still get the first error.
I’ve never worked with these detection datasets before, so if there’s anything I should know, I’d appreciate the help.
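From reading around, I suspect the problem isn’t Resize at all but the targets: VOCDetection returns (image, annotation_dict) pairs, and the default collate can’t batch annotation dicts whose object lists have different lengths per image. Is a custom collate_fn like the sketch below the right workaround? (voc_collate is just a name I made up.)

```python
import torch

def voc_collate(batch):
    # batch is a list of (image, target) pairs from VOCDetection.
    # Stack only the image tensors; keep the annotation dicts in a
    # plain list, since each image can contain a different number
    # of objects and so the dicts can't be stacked.
    images = torch.stack([image for image, _ in batch])
    targets = [target for _, target in batch]
    return images, targets

# Usage (my assumption):
# train_loader = DataLoader(train, batch_size=32, num_workers=4,
#                           collate_fn=voc_collate)
```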