Issues with dataloader

ahmed · January 24, 2019, 4:52pm

I am trying to use the KAIST dataset for pedestrian detection. I have done the tutorials on the PyTorch website. I am new to machine learning/deep learning and have limited programming experience. I have designed a custom dataset class and dataloader based on the ’ DATA LOADING AND PROCESSING TUTORIAL’ on the PyTorch website. To attempt to get a better understanding, I am using a small amount of images and labels to feed into a neural network.

class dataset(Dataset):

    def __init__(self, csv_file, root_dir, transform=None):
        """
        Args:
            csv_file (string): Path to the csv file with annotations.
            root_dir (string): Directory with all the images.
            transform (callable, optional): Optional transform to be applied
                on a sample.
        """
        self.csv = pd.read_csv(csv_file)
        self.root_dir = root_dir
        self.transform = transform

    def __len__(self):
        return len(self.csv)

    def __getitem__(self, idx):
        img_name = os.path.join(self.root_dir, self.csv.iloc[idx, 0])
        image = io.imread(img_name)
        labels = self.csv.iloc[idx, 18]
        labels = labels.astype('float').reshape(-1, 1)
        sample = {'image': image, 'labels': labels}

        if self.transform:
            sample = self.transform(sample)

        return sample

csv_file='train/images/annotations.csv', root_dir='train/images/'

class ToTensor(object):
    """Convert ndarrays in sample to Tensors."""

    def __call__(self, sample):
        image, labels = sample['image'], sample['labels']

        # swap color axis because
        # numpy image: H x W x C
        # torch image: C X H X W
        image = image.transpose((2, 0, 1))
        return {'image': torch.from_numpy(image),
                'labels': torch.from_numpy(labels)}

transformed_dataset = dataset(csv_file='train/images/annotations.csv',
                                           root_dir='train/images/',
                                           transform=transforms.Compose([ToTensor()]))

for i in range(len(transformed_dataset)):
    sample = transformed_dataset[i]

    print(i, sample['image'].size(), sample['labels'].size())

    if i == 3:
        break

However, for some reason the dataloader will not output the batch size.

dataloader = DataLoader(transformed_dataset, batch_size=64)

loader = DataLoader(dataloader, batch_size=64)

for image, labels in dataloader:
    
    print(sample['image'].shape, sample['labels'].shape)

torch.Size([3, 512, 640]) torch.Size([1, 1])
torch.Size([3, 512, 640]) torch.Size([1, 1])
torch.Size([3, 512, 640]) torch.Size([1, 1])
torch.Size([3, 512, 640]) torch.Size([1, 1])
torch.Size([3, 512, 640]) torch.Size([1, 1])
torch.Size([3, 512, 640]) torch.Size([1, 1])
torch.Size([3, 512, 640]) torch.Size([1, 1])

Any advice would be greatly appreciated.

ptrblck · January 24, 2019, 9:49pm

You are still printing the Dataset output.
Try to print the shape of image and labels instead and the batch size should be in dim0.

ahmed · January 26, 2019, 3:59pm

Thank you very much for your reply. I have managed to get it working now.

ahmed · January 26, 2019, 4:36pm

I have updated my code based on another example i found online:

class CustomDatasetFromImages(Dataset):
    def __init__(self, csv_path):
        """
        Args:
            csv_path (string): path to csv file
            img_path (string): path to the folder where images are
            transform: pytorch transforms for transforms and tensor conversion
        """
        # Transforms
        self.to_tensor = transforms.ToTensor()
        # Read the csv file
        self.data_info = pd.read_csv(csv_path)
        # First column contains the image paths
        self.image_arr = np.asarray(self.data_info.iloc[:, 0])
        # Column is the labels
        self.label_arr = np.asarray(self.data_info.iloc[:, 18])
        # Calculate len
        self.data_len = len(self.data_info.index)

    def __getitem__(self, index):
        # Get image name from the pandas df
        single_image_name = self.image_arr[index]
        # Open image
        img_as_img = Image.open(single_image_name)
        # Transform image to tensor
        img_as_tensor = self.to_tensor(img_as_img)
        # Get label(class) of the image based on the cropped pandas column
        single_image_label = self.label_arr[index]

        return (img_as_tensor, single_image_label)

    def __len__(self):
        return self.data_len

if __name__ == "__main__":
    # Call dataset
    custom_mnist_from_images =  \
        CustomDatasetFromImages('annotations.csv')

    # Define data loader
    mn_dataset_loader = torch.utils.data.DataLoader(dataset=custom_mnist_from_images,
                                                    batch_size=4,
                                                    shuffle=False)

This is providing me the output i was expecting from the dataloader. However, I am not getting the expected results when i run it through my CNN network.

import torch.nn as nn
import torch.nn.functional as F


class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        self.conv1 = nn.Conv2d(3, 6, 5)
        self.pool = nn.MaxPool2d(2, 2)
        self.conv2 = nn.Conv2d(6, 16, 5)
        self.fc1 = nn.Linear(16 * 125 * 157, 157) 
        #print(self.fc1) 
        self.fc2 = nn.Linear(157, 84)
        self.fc3 = nn.Linear(84, 10)

    def forward(self, x):
        #print(x.size()) 
        x = self.pool(F.relu(self.conv1(x)))
        #print(x.size()) 
        x = self.pool(F.relu(self.conv2(x)))
        #print(x.size())
        x = x.view(x.size(0), -1)
        print(x.size()) # torch.Size([50240, 400])
        x = F.relu(self.fc1(x))
        print(x.size()) # torch.Size([50240, 120])
        x = F.relu(self.fc2(x))
        print(x.size()) # torch.Size([50240, 84])
        x = self.fc3(x) 
        #print(x.size()) # torch.Size([50240, 10])
        return x


net = Net()

import torch.optim as optim

criterion = nn.CrossEntropyLoss()
optimizer = optim.SGD(net.parameters(), lr=0.001, momentum=0.9)

for epoch in range(2):

    running_loss = 0.0
    for i, data in enumerate(mn_dataset_loader, 0):
        images, labels = data
        #print(images.shape, labels.shape)
        images = Variable(images)
        labels = Variable(labels)
        # Clear gradients
        optimizer.zero_grad()
        #print(images.shape, labels.shape)
        # Forward pass
        outputs = net(images)
        # Calculate loss
        loss = criterion(outputs, labels)
        # Backward pass
        loss.backward()
        #print(loss)
        # Update weights
        optimizer.step()
        
        running_loss += loss.item()
        if i % 1000 == 999:    # print every 2000 mini-batches
            print('[%d, %5d] loss: %.3f' %
                  (epoch + 1, i + 1, running_loss / 2000))
            running_loss = 0.0
            
print('Finished Training')

The result i am getting is:

Finished Training

But i was expecting something like this (based on the ‘Training a Classifier’ PyTorch tutorial):

[1,  2000] loss: 2.216
[1,  4000] loss: 1.904
[1,  6000] loss: 1.675
[1,  8000] loss: 1.573
[1, 10000] loss: 1.529
[1, 12000] loss: 1.461
[2,  2000] loss: 1.408
[2,  4000] loss: 1.383
[2,  6000] loss: 1.355
[2,  8000] loss: 1.323
[2, 10000] loss: 1.289
[2, 12000] loss: 1.292
Finished Training

Again any advice would be greatly appreciated.

ptrblck · January 26, 2019, 6:19pm

In the tutorial the print statement is a bit different so that it’s not just printing the current size of your data but also the overall data length of your dataset.
You could try to adapt it to your use case.

ahmed · January 26, 2019, 6:33pm

Thanks for the quick reply. I’m sorry but i’m new to python (well programming in general) so I’m not sure that sure exactly what you mean? I’m assuming you are referring to
“[%d, %5d] loss: %.3f’ % (epoch + 1, i + 1, running_loss / 2000))”, where %d, %5d is referring to the data. But I’m not sure how i would adapt it.

ahmed · January 27, 2019, 3:28pm

Update: Thanks I figured out what you meant. It was the batch size. I think it was because I didn’t have 2000 images in the folder i was using for training. Thank you for your help.