Dataloaders for numpy array image

shivangi · November 10, 2018, 5:34pm

Can someone help me create a custom dataloader where each image and its corresponding segmentation mask is stored as a numpy array?
I have no clue how to create loaders for this segmentation task

ptrblck · November 10, 2018, 7:37pm

Sure!
The important part creating your own Dataset is to get the shapes for the data and target right.
In a simple segmentation use case where each pixel can only belongs to a single class, we would want to use a criterion like nn.CrossEntropyLoss.
Therefore your data should have the input shape of images, i.e. [batch__size, c, h, w].
Our target should contain the class indices for each pixel, thus the channel dimension is missing: [batch_size, h, w], and should be of type torch.long.
I’ve created a dummy example using a Dataset, DataLoader, and a very simple model:

class MyDataset(Dataset):
    def __init__(self, data, target, transform=None):
        self.data = data
        self.target = target
        self.transform = transform
        
    def __getitem__(self, index):
        x = self.data[index]
        y = self.target[index]
        
        if self.transform:
            x = self.transform(x)
            
        return x, y
    
    def __len__(self):
        return len(self.data)
    

class MyModel(nn.Module):
    def __init__(self, in_channels, nb_classes):
        super(MyModel, self).__init__()
        self.conv1 = nn.Conv2d(in_channels, 16, 3, 1, 1)
        self.conv2 = nn.Conv2d(16, nb_classes, 3, 1, 1)
        
    def forward(self, x):
        x = F.relu(self.conv1(x))
        x = self.conv2(x)
        return x


device = 'cuda' if torch.cuda.is_available() else 'cpu'
nb_samples = 10
c, h, w = 3, 24, 24
nb_classes = 5
data_arr = np.random.randint(0, 255, (nb_samples, h, w, c), dtype=np.uint8)
target_arr = torch.from_numpy(np.random.randint(0, nb_classes, (nb_samples, h, w)))

transform = transforms.ToTensor()
dataset = MyDataset(data_arr, target_arr, transform)
loader = DataLoader(
    dataset,
    batch_size=2,
    num_workers=1,
    shuffle=True,
    pin_memory=torch.cuda.is_available()
)

model = MyModel(c, nb_classes).to(device)

criterion = nn.CrossEntropyLoss()
optimizer = optim.Adam(model.parameters(), lr=1e-3)

# Training routine
for epoch in range(10):
    for batch_idx, (data, target) in enumerate(loader):
        data = data.to(device)
        target = target.to(device)

        optimizer.zero_grad()
    
        output = model(data)
        loss = criterion(output, target)
        loss.backward()
        optimizer.step()
        
        print('Epoch {}, batch {}, loss {}'.format(
            epoch, batch_idx, loss.cpu().item()))

You can skip some of the transformations, if your numpy image array is already in np.float32 type and just might call torch.from_numpy directly.
Let me know, if you can adapt this code to your use case.