Hi all! PyTorch newbie here!
I’m trying to create a custom dataset to plug into DataLoader
that is composed of single-channel images (20000 x 1 x 28 x 28), single-channel masks (20000 x 1 x 28 x 28), and three labels per sample (20000 x 3).
Following the documentation, I thought I would test creating a dataset with a single-channel image and a single-channel mask, using the following code:
class CustomDataset(Dataset):
    def __init__(self, images, masks, transforms=None, train=True):
        self.images = images
        self.masks = masks
        self.transforms = transforms

    def __getitem__(self, index):
        image = self.images.iloc[index, :]
        image = np.asarray(image).astype(np.float64).reshape(28, 28, 1)
        mask = self.masks.iloc[index, :]
        mask = np.asarray(mask).astype(np.float64).reshape(28, 28, 1)
        transformed_image = self.transforms(image)
        return transformed_image, mask

    def __len__(self):
        return len(self.images)
Using this class, I build the dataset from two pandas DataFrames and plug it into DataLoader:
transform = transforms.Compose([
    transforms.ToPILImage(),
    transforms.ToTensor(),
    transforms.Normalize((0.5,), (0.5,))
])
train_images = pd.read_csv('train.csv')
train_masks = pd.read_csv('masks.csv')
train_data = CustomDataset(train_images, train_masks, transform)
trainloader = DataLoader(train_data, batch_size=128, shuffle=True)
I would expect the shape of a single batch from trainloader to be ([128, 1, 28, 28], [128, 1, 28, 28]), with the image on the left and the mask on the right. Instead, the shape of a single batch from trainloader is ([128, 1, 28, 28], [128]), which makes me think the masks have somehow been turned into labels.
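For reference, this is how I'm checking the batch shapes. It's a minimal, self-contained sketch where random tensors stand in for my real CSV data (with these stand-ins both shapes come out as I'd expect; the [128] only shows up with my actual data):

```python
import torch
from torch.utils.data import TensorDataset, DataLoader

# Hypothetical stand-ins for my real images and masks
images = torch.randn(512, 1, 28, 28)
masks = torch.randn(512, 1, 28, 28)

loader = DataLoader(TensorDataset(images, masks), batch_size=128, shuffle=True)

# Grab one batch and inspect its shapes
batch_images, batch_masks = next(iter(loader))
print(batch_images.shape)  # torch.Size([128, 1, 28, 28])
print(batch_masks.shape)   # torch.Size([128, 1, 28, 28]) here, but [128] with my real data
```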
How do I fix this, and how do I add in the actual labels? Thanks in advance for your help!