Torch Dataloader gives torch.Tensor as ouput

ptrblck · December 2, 2018, 3:25pm

I assume the shape you’ve printed is the shape of the batch from the DataLoader.
If that’s the case your MR and CT data have the shape [200, 200] in __getitem__.

Based on your description it seems you would like to apply the same transformation on the whole batch instead of each single image.

If that’s the case one approach would be to create an own sampler and provide a list of random indices to your Dataset.
In the __getitem__ you would get a list of indices of the length batch_size, could load all images one by one and apply the same transformation on them.
In the training loop you would get an additional batch dimension and can just squeeze it.
Here is a small example:

class MyDataset(Dataset):
    def __init__(self, data):
        self.data = data
        
    def __getitem__(self, index):
        x = torch.stack([self.data[i] for i in index])
        return x
    
    def __len__(self):
        return len(self.data)


class RandomBatchSampler(torch.utils.data.sampler.Sampler):
    def __init__(self, data_source, batch_size):
        self.data_source = data_source
        self.batch_size = batch_size

    def __iter__(self):
        rand_idx = torch.randperm(len(self.data_source)).tolist()
        data_iter = iter([rand_idx[i:i+self.batch_size] for i in range(0, len(rand_idx), self.batch_size)])
        return data_iter

    def __len__(self):
        return len(self.data_source)//self.batch_size


data = torch.randn(100, 3, 24, 24)
dataset = MyDataset(data)

batch_size = 64
sampler = RandomBatchSampler(data, batch_size=batch_size)

loader = DataLoader(
    dataset,
    batch_size=1,
    num_workers=2,
    sampler=sampler
)

for x in loader:
    x.squeeze_(0)
    print(x.shape)