Train a simple convnet on cifar10

Hello,

I am trying to learn how to use PyTorch. I am having some difficulties using the data loaders.

Can anyone tell me if my code looks OK? During training the output tensors are all zeros and the loss stays the same.

I was wondering if there is something wrong with the way I am loading the data. In general, I want to load the data myself rather than through torchvision.datasets. (I am loading the inputs and targets from a file that was meant to be used with Torch, which is why I am converting the label values from 1-10 to 0-9.)

Thank you.

Here is the code:

from __future__ import print_function
import argparse
import torch
import torch.utils.data as utils_data
import torch.nn as nn
import torch.nn.functional as F
import torch.optim as optim
from torchvision import datasets, transforms
from torch.autograd import Variable
import numpy as np
import torchfile

class MyConvNet(nn.Module):

    def __init__(self):
        super(MyConvNet, self).__init__()
        self.conv1 = nn.Conv2d(3, 32, kernel_size=3, stride=1, padding=0)
        self.pool1 = nn.MaxPool2d(kernel_size=3, stride=2)
        self.conv2 = nn.Conv2d(32, 64, kernel_size=3, stride=1, padding=0)
        self.pool2 = nn.MaxPool2d(kernel_size=3, stride=2)
        self.fc1 = nn.Linear(1600, 500)
        self.fc2 = nn.Linear(500, 10)

    def forward(self, input):
        x = self.pool1(F.relu(self.conv1(input)))
        x = self.pool2(F.relu(self.conv2(x)))
        x = x.view(x.size(0), -1)
        x = F.relu(self.fc1(x))
        x = F.relu(self.fc2(x))
        return x

net = MyConvNet()
print(net)
model = MyConvNet()
model.cuda()

patches = torchfile.load('/data1/04_MP/test_pytorch/cifar-10-batches-t7/train.t7')
labels = torchfile.load('/data1/04_MP/test_pytorch/cifar-10-batches-t7/train_t.t7')
for i in range(labels.shape[0]):
    if labels[i] == 1:
        labels[i] = 0
    if labels[i] == 2:
        labels[i] = 1
    if labels[i] == 3:
        labels[i] = 2
    if labels[i] == 4:
        labels[i] = 3
    if labels[i] == 5:
        labels[i] = 4
    if labels[i] == 6:
        labels[i] = 5
    if labels[i] == 7:
        labels[i] = 6
    if labels[i] == 8:
        labels[i] = 7
    if labels[i] == 9:
        labels[i] = 8
    if labels[i] == 10:
        labels[i] = 9

tensor_input = torch.from_numpy(patches)
tensor_target = torch.from_numpy(labels)

training_samples = utils_data.TensorDataset(tensor_input.float(), tensor_target.long())
data_loader = utils_data.DataLoader(training_samples, batch_size=100, shuffle=True)

optimizer = optim.SGD(model.parameters(), lr=0.01, momentum=0.5)
criterion = nn.CrossEntropyLoss()

for epoch in range(40):
    print('\nEpoch: %d' % epoch)
    model.train()
    train_loss = 0.0
    for i, (inputs, targets) in enumerate(data_loader, 0):
        inputs, targets = inputs.cuda(), targets.cuda()
        inputs, targets = Variable(inputs), Variable(targets)
        optimizer.zero_grad()
        outputs = model(inputs)
        print(outputs)
        loss = criterion(outputs, targets)
        loss.backward()
        optimizer.step()
        train_loss += loss.data[0]
    print('[%d, %5d] loss: %.3f' % (epoch+1, i+1, train_loss / 500))
print('Finished Training')

Hi,
The dataloader definition looks OK. Have you tried printing the inputs and targets inside your loop?
Have you checked that your dataset definition is correct? You could try accessing one dataset element with sample = training_samples[0] to check if the output is as expected.
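For example, a quick check along these lines (just a sketch, reusing the variable names from your post):

sample_input, sample_target = training_samples[0]
print(sample_input.size(), sample_input.min(), sample_input.max())  # expect 3x32x32 and your pixel value range
print(sample_target)  # expect an integer label in 0-9

inputs, targets = next(iter(data_loader))
print(inputs.size(), targets.size())  # expect 100x3x32x32 and 100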

Also, I believe you could vectorize that label conversion by simply doing labels = labels - 1 :slight_smile:
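Something like this should be equivalent to the whole loop (assuming labels is the NumPy array you get back from torchfile.load):

labels = labels - 1  # shift 1-10 down to 0-9 in one vectorized step
tensor_target = torch.from_numpy(labels)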

Hi,

Thank you for replying to me!!

Yes, I printed the inputs, targets, and outputs inside my loop. The results are:

inputs: [torch.cuda.FloatTensor of size 100x3x32x32 (GPU 0)] (with values 0-255)
targets: [torch.cuda.LongTensor of size 100 (GPU 0)] (with values 0-9)

I also printed what you suggested (training_samples[i]). For i=0 (I checked other values of i as well), the result looks something like this:
(0 ,.,.) =
59 43 50 … 158 152 148
16 0 18 … 123 119 122
25 16 49 … 118 120 109
… ⋱ …
208 201 198 … 160 56 53
180 173 186 … 184 97 83
177 168 179 … 216 151 123

(1 ,.,.) =
62 46 48 … 132 125 124
20 0 8 … 88 83 87
24 7 27 … 84 84 73
… ⋱ …
170 153 161 … 133 31 34
139 123 144 … 148 62 53
144 129 142 … 184 118 92

(2 ,.,.) =
63 45 43 … 108 102 103
20 0 0 … 55 50 57
21 0 8 … 50 50 42
… ⋱ …
96 34 26 … 70 7 20
96 42 30 … 94 34 34
116 94 87 … 140 84 72
[torch.FloatTensor of size 3x32x32]
, 6)

which seems fine.

Is there a possibility that the range 0-255 is inappropriate for PyTorch?

I forgot to write about the outputs.

They look like this:

Variable containing:
0 0 0 … 0 0 0
0 0 0 … 0 0 0
0 0 0 … 0 0 0
… ⋱ …
0 0 0 … 0 0 0
0 0 0 … 0 0 0
0 0 0 … 0 0 0
[torch.cuda.FloatTensor of size 100x10 (GPU 0)]

I believe you should use a range [0, 1] for the image data. A simple normalization should suffice.
Everything else looks OK.
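For example, something along these lines before building the TensorDataset (just a sketch, assuming patches still holds the raw 0-255 pixel values):

tensor_input = torch.from_numpy(patches).float() / 255.0  # scale [0, 255] down to [0, 1]
training_samples = utils_data.TensorDataset(tensor_input, tensor_target.long())
data_loader = utils_data.DataLoader(training_samples, batch_size=100, shuffle=True)

You could also subtract a per-channel mean afterwards, but dividing by 255 is the simplest place to start.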