Hi all.
I’m new to PyTorch and I’m trying to build my own classifier. My dataset has nearly 30 thousand images in 52 classes, and each image is 60 × 80 pixels.
This is my network (I’m not sure about the number of neurons in each layer):
import torch
import torch.nn as nn
import torch.nn.functional as F


class my_network(nn.Module):
    def __init__(self, class_num, act=F.relu):
        super(my_network, self).__init__()
        self.layer1 = nn.Linear(1 * 60 * 80, 50 * 30 * 40)
        self.act1 = act
        self.layer2 = nn.Linear(50 * 30 * 40, 70 * 10 * 15)
        self.act2 = act
        self.layer3 = nn.Linear(70 * 10 * 15, 90 * 5 * 8)
        self.act3 = act
        self.layer4 = nn.Linear(90 * 5 * 8, 80)
        self.act4 = act
        self.layer5 = nn.Linear(80, class_num)

    def forward(self, x):
        # Flatten each image to a vector of 1 * 60 * 80 values.
        x = x.view(x.size(0), -1)
        x = self.layer1(x)
        x = self.act1(x)
        x = self.layer2(x)
        x = self.act2(x)
        x = self.layer3(x)
        x = self.act3(x)
        x = self.layer4(x)
        x = self.act4(x)
        # The last layer returns raw logits, as CrossEntropyLoss expects.
        x = self.layer5(x)
        return x
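For reference, here is a rough sketch of how many parameters these layer sizes imply. It assumes class_num = 52 (the number of classes mentioned above) and float32 weights; linear_param_count is a helper name made up for this illustration, not part of PyTorch.

```python
# Layer widths copied from my_network; the final 52 assumes class_num = 52.
layer_sizes = [1 * 60 * 80, 50 * 30 * 40, 70 * 10 * 15, 90 * 5 * 8, 80, 52]


def linear_param_count(n_in, n_out):
    """Weights (n_in * n_out) plus biases (n_out) of one fully connected layer."""
    return n_in * n_out + n_out


# One entry per nn.Linear layer, pairing each width with the next one.
per_layer = [linear_param_count(a, b) for a, b in zip(layer_sizes, layer_sizes[1:])]
total_params = sum(per_layer)

# float32 takes 4 bytes per parameter; gradients need the same amount again.
weights_gib = total_params * 4 / 1024**3

for i, n in enumerate(per_layer, start=1):
    print(f"layer{i}: {n:,} parameters")
print(f"total: {total_params:,} parameters, about {weights_gib:.2f} GiB in float32")
```

The first two layers account for almost all of the total, so those are the widths to look at first if the sizes need to change.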
I’m running the model on CUDA, with CrossEntropyLoss as the criterion and SGD as the optimizer.
model = my_network(len(classes))
model = model.to(device)
learning_rate = 0.01
criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(model.parameters(), lr=learning_rate)
I use the following code for training my model.
for epoch in range(num_epochs):
    train_loss = 0.
    for images, labels in train_loader:
        images = images.to(device)
        labels = labels.to(device)
        optimizer.zero_grad()
        outputs = model(images)
        loss = criterion(outputs, labels)
        loss.backward()
        optimizer.step()
        print(loss.item())
        train_loss += loss.item()
    average_loss = train_loss / len(train_loader)
When I run this, the output is nan: loss.item() already returns nan in the first epoch.
nan
nan
nan
nan
...
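To narrow down where the first nan appears, a check along these lines could be run on one batch. This is only a sketch: first_non_finite is a hypothetical helper, and toy_model, toy_images and toy_labels are small stand-ins so the snippet runs on its own; in the real loop they would be model, images and labels.

```python
import torch
import torch.nn as nn


def first_non_finite(named_tensors):
    """Return the name of the first tensor containing nan or inf, else None."""
    for name, tensor in named_tensors:
        if not torch.isfinite(tensor).all():
            return name
    return None


# Hypothetical stand-ins for the real model and one batch from train_loader.
torch.manual_seed(0)
toy_model = nn.Linear(60 * 80, 52)
toy_images = torch.rand(4, 1, 60, 80)
toy_labels = torch.randint(0, 52, (4,))
criterion = nn.CrossEntropyLoss()

# Same steps as one training iteration, without the optimizer update.
outputs = toy_model(toy_images.view(toy_images.size(0), -1))
loss = criterion(outputs, toy_labels)
loss.backward()

# Check in the order the values are produced: inputs, outputs, loss, gradients.
checks = [("images", toy_images), ("outputs", outputs), ("loss", loss)]
checks += [(f"grad of {name}", p.grad) for name, p in toy_model.named_parameters()]
print(first_non_finite(checks))  # None when every tensor is finite
```

If the reported name is "images", the problem is in the data; if it is "outputs" or "loss", the forward pass is already overflowing; if only a gradient is reported, the backward pass is where it breaks.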
Also, I don’t want to normalize my data; I want to use it as it is.
What am I doing wrong?