Problem with image/dataloader

bapriddy · April 11, 2019, 3:39pm

First, I am wondering how the MNIST dataloader works. It loads images from, say, ~/mnist/data/processed/training.pt (the default MNIST code from pytorch/examples). Then the training loop unpacks data, target from train_loader. So how is the data split into images and labels, and is it done automatically?

    model.train()
    for batch_idx, (data, target) in enumerate(train_loader):
        data, target = data.to(device), target.to(device)
        optimizer.zero_grad()
        output = model(data)
        loss = F.nll_loss(output, target)
        loss.backward()
        optimizer.step()
        if batch_idx % args.log_interval == 0:
            print('Train Epoch: {} [{}/{} ({:.0f}%)]\tLoss: {:.6f}'.format(
                epoch, batch_idx * len(data), len(train_loader.dataset),
                100. * batch_idx / len(train_loader), loss.item()))

Secondly, how do I set up a loader with my own images? I have a working model, but I am not sure how it splits the data (or if it does at all). Here’s my code…

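import torch
import torch.nn as nn
import torch.nn.functional as F
import torch.optim as optim
from torchvision import transforms
from torchvision.datasets import ImageFolder

# Defined before the model is created, since Net().to(device) needs it
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# w_path is the root folder of the images (one subfolder per class)
simple_transform = transforms.Compose([transforms.Resize((32, 32)), transforms.ToTensor()])
train = ImageFolder(w_path, simple_transform)
train_loader = torch.utils.data.DataLoader(train, shuffle=True, batch_size=20, num_workers=4, pin_memory=True)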

# Model
class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        self.conv1 = nn.Conv2d(3, 32, 3, 1, 1)
        self.bn1 = nn.BatchNorm2d(32)
        self.dp1 = nn.Dropout(p=0.25)
        self.conv2 = nn.Conv2d(32, 64, 3, 1, 1)
        self.bn2 = nn.BatchNorm2d(64)
        self.dp2 = nn.Dropout(p=0.25)
        self.fc1 = nn.Linear(4*4*64, 128)
        self.fc2 = nn.Linear(128, 46)

    def forward(self, x):
        x = F.relu(self.conv1(x))
        x = F.max_pool2d(x, 2, 2)
        x = self.bn1(x)
        #x = self.dp1(x)
        x = F.relu(self.conv2(x))
        #x = self.bn2(x)
        x = F.max_pool2d(x, 2, 2)
        x = x.view(-1,4*4*64)
        x = F.relu(self.fc1(x))
        x = self.fc2(x)
        return x

model = Net().to(device)
       
# choose optimizer
optimizer = optim.Adam(model.parameters(), lr=0.001)   
#optimizer = optim.SGD(model.parameters(), lr=0.001, momentum=0.9)

# choose loss function
#loss_fn = torch.nn.MSELoss(reduction='sum')
#loss_fn = torch.nn.L1Loss(reduction='mean')
loss_fn = torch.nn.SmoothL1Loss(reduction='mean')
    
#device = torch.device("cuda" if torch.cuda.is_available() else "cpu") 



def train(model, device, train_loader, optimizer, epoch):
    model.train()
    for batch_idx, (data, target) in enumerate(train_loader):
        data, target = data.float().to(device), target.float().to(device)
        optimizer.zero_grad()
        output = model(data)
        #loss = F.nll_loss(output, target)
        loss = loss_fn(output, target)
        loss.backward()
        optimizer.step()
        print(loss.item())
        #if batch_idx % log_interval == 0:
        #    print('Train Epoch: {} [{}/{} ({:.0f}%)]\tLoss: {:.6f}'.format(
        #        epoch, batch_idx * len(data), len(train_loader.dataset),
        #        100. * batch_idx / len(train_loader), loss.item()))

My network model is slightly different from the pytorch/examples/mnist code on GitHub. I have about 2000 images in total, but in the example above the loader only pulls 46 images from the ~/data directory, which contains 46 images (not all 2000).

Model output below…

        Layer (type)               Output Shape         Param #
================================================================
            Conv2d-1           [-1, 32, 64, 64]             896
       BatchNorm2d-2           [-1, 32, 32, 32]              64
            Conv2d-3           [-1, 64, 32, 32]          18,496
            Linear-4                  [-1, 128]         131,200
            Linear-5                   [-1, 46]           5,934
================================================================
Total params: 156,590
Trainable params: 156,590
Non-trainable params: 0
----------------------------------------------------------------
Input size (MB): 0.05
Forward/backward pass size (MB): 1.75
Params size (MB): 0.60
Estimated Total Size (MB): 2.40
----------------------------------------------------------------
0.004562674555927515
0.0014697645092383027
0.0011804916430264711
0.0007775453268550336
0.00042929680785164237
0.00036733646993525326
0.00031455850694328547
0.0003125681832898408
0.00028487335657700896
0.0002894191420637071
0.00021117702999617904
0.00016488203254994005
0.00014666165225207806
0.00011506529699545354
0.00011069038737332448
9.281547681894153e-05
9.236995538230985e-05
7.372548134298995e-05
7.20455645932816e-05
5.869231972610578e-05

jmaronas · April 11, 2019, 4:08pm

Hi.

I think the code on GitHub is quite illustrative. Take a look at the download method and the __init__ method of the MNIST dataset class in torchvision.
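
To make the first point concrete, here is a minimal sketch of the mechanism, not the actual torchvision code. ToyDigits is a hypothetical stand-in for the MNIST class, and its tensors are synthetic rather than read from training.pt. The idea is that the dataset's __getitem__ returns one (image, label) pair, and the DataLoader stacks those pairs into batches, which is why the loop can unpack (data, target).

```python
import torch
from torch.utils.data import Dataset, DataLoader


class ToyDigits(Dataset):
    """Hypothetical stand-in for torchvision's MNIST dataset class."""

    def __init__(self, num_samples=10):
        # The real class reads its images and labels from disk;
        # here they are synthetic tensors held in memory.
        self.data = torch.zeros(num_samples, 1, 28, 28)
        self.targets = torch.arange(num_samples) % 10

    def __len__(self):
        return len(self.data)

    def __getitem__(self, index):
        # One sample is returned as an (image, label) pair.
        return self.data[index], self.targets[index]


# The DataLoader calls __getitem__ for each index in a batch and
# stacks the results, so each iteration yields (data, target).
loader = DataLoader(ToyDigits(), batch_size=4, shuffle=False)

batch_shapes = []
for data, target in loader:
    batch_shapes.append((tuple(data.shape), tuple(target.shape)))
    print(data.shape, target.shape)
```

So nothing is "split" by the loader itself: the pairing of image and label comes from the dataset, and the loader only groups samples into batches.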

For your second question take a look at this
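
As a rough sketch for the second question: ImageFolder takes each image's label from the name of the subfolder it sits in (root/class_a/xxx.png, root/class_b/yyy.png, and so on), and the DataLoader only batches and shuffles. Neither of them creates a train/validation split. If you want one, torch.utils.data.random_split is one way to do it. In the example below a TensorDataset with random tensors stands in for ImageFolder(w_path, simple_transform), and the 80/20 ratio and the seed are just assumptions for illustration.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset, random_split

# Synthetic stand-in for ImageFolder(w_path, simple_transform):
# 100 RGB images of size 32x32 with labels in [0, 46).
images = torch.rand(100, 3, 32, 32)
labels = torch.randint(0, 46, (100,))
full_dataset = TensorDataset(images, labels)

# Hold out 20% for validation (an arbitrary choice for this sketch).
num_val = len(full_dataset) // 5
num_train = len(full_dataset) - num_val

# The seeded generator makes the split reproducible.
train_set, val_set = random_split(
    full_dataset,
    [num_train, num_val],
    generator=torch.Generator().manual_seed(0),
)

train_loader = DataLoader(train_set, batch_size=20, shuffle=True)
val_loader = DataLoader(val_set, batch_size=20, shuffle=False)

print(len(train_set), len(val_set))
```

With a real ImageFolder you can also print len(train) and train.classes to check how many images and classes were actually found under the root folder.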

Hope it helps

bapriddy · April 11, 2019, 6:02pm

Yes, very helpful. Thanks!!