Empty state_dict

Carson · September 30, 2018, 6:26pm

Hello,

I’m a n00b trying to make a simple neural network for the MNIST dataset. My goal is to make this net have a variable number of layers and activation functions chosen by the user.

I have PyTorch 0.4.1 with no CUDA on my laptop.

So first I set off to define my own module. It had an empty parameter list, and I read on Stack Overflow that I could just override the parameters method by adding all the parameters of each Linear layer to one list. I also made a one-hot encoding function.

import torch

def encoding_function(num_categories):
    """One-hot encoder: given the number of categories, returns a one-hot
    encoder function that eats tensors of size (N,1) (where N is the batch
    size and each row corresponds to one numerical value) and spits out a
    tensor of size (N,num_categories) where each row corresponds to
    the one-hot encoding associated to the numerical value."""

    def encoder(vals):
        """vals of size (N,1), each row is one value in
        [0,num_categories-1]"""
        batch_size = vals.size(0)
        encoded_vals = torch.zeros((batch_size, num_categories))

        for i in range(batch_size):
            val = vals[i].item()
            encoded_vals[i][val] = 1

        return encoded_vals

    return encoder

class NNet(torch.nn.Module):

    def __init__(self, layer_shapes, activation_functions):
        super(NNet, self).__init__()
        assert len(layer_shapes) == len(activation_functions) + 1
        self.layer_shapes = layer_shapes
        self.activation_functions = activation_functions

        linear_functions = list()
        for i in range(len(self.layer_shapes)-1):
            linear_functions.append(torch.nn.Linear(
                    self.layer_shapes[i], self.layer_shapes[i+1]))

        self.linear_functions = linear_functions

    def parameters(self):
        parameters = list()
        for function in self.linear_functions:
            parameters = parameters+list(function.parameters())

        return parameters

    def forward(self, x):
        y = x
        for i in range(len(self.layer_shapes)-1):
            assert y.shape[1] == self.layer_shapes[i]
            y = self.activation_functions[i](self.linear_functions[i](y))
        return y

data:

import matplotlib.pyplot as plt
import torch
import torchvision
import ai.neural_network as nnet

batch_size = 30
epochs = 500
learning_rate = 0.001

train_set = torchvision.datasets.MNIST(root='/home/carson/Desktop/Archive/Computing/Projects/Python/Data',
                                       train=True,
                                       transform=torchvision.transforms.ToTensor(),
                                       download=True)
test_set = torchvision.datasets.MNIST(root='/home/carson/Desktop/Archive/Computing/Projects/Python/Data',
                                      train=False,
                                      transform=torchvision.transforms.ToTensor(),
                                      download=True)

train_loader = torch.utils.data.DataLoader(dataset=train_set, batch_size=batch_size, shuffle=True)
test_loader = torch.utils.data.DataLoader(dataset=test_set, batch_size=batch_size, shuffle=False)

model = nnet.NNet([784, 16, 10], [torch.nn.Tanh(), torch.nn.Softmax(dim=1)])
loss_function = torch.nn.MSELoss()
optimizer = torch.optim.SGD(model.parameters(), lr=learning_rate)

training:

loss_items = list()

for t in range(epochs):
    for i, (images, labels) in enumerate(train_loader):

        encoder = nnet.encoding_function(10)
        labels = encoder(labels)

        images = images.reshape(-1,28*28)
        outputs = model(images)

        loss = loss_function(outputs, labels)

        if i%1000 == 0:
            loss_items.append(loss.item())

        optimizer.zero_grad()
        loss.backward()
        optimizer.step()

plot it:

plt.figure(figsize=(15,5))
plt.plot(loss_items)

test it:

with torch.no_grad():
    correct = 0
    total = 0

    for images, labels in test_loader:
        images = images.reshape(-1, 28*28)
        outputs = model(images)
        _, predicted = torch.max(outputs.data, 1)
        total += labels.size(0)
        correct += (predicted == labels).sum()

print('Accuracy of the model on the 10000 test images: {} %'.format(100 * correct / total))

results:

So it came up 90% accurate for me which I felt was good enough for this network.
I wanted to save it, but it says

model.state_dict() => OrderedDict()

is empty

I haven’t found anything about this yet, so I was wondering if anybody knew off the bat how I could fix this. I do have the parameters in a list, so there is that. I’m not sure if I can manually set those in a fresh network.

Also, I’m new to PyTorch (I started two days ago), so if there are any recommendations for doing something in a better way, please let me know.
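On the request for a better way of doing things: one small option, offered as a sketch rather than a definitive recommendation, is to build the one-hot labels without a Python loop. The helper name `one_hot` below is hypothetical and not part of the code above; it uses `Tensor.scatter_` to write all the ones in a single call, and it accepts labels of shape (N,) or (N,1).

```python
import torch


def one_hot(labels, num_categories):
    """Hypothetical helper. labels: integer tensor of shape (N,) or (N,1)
    with values in [0, num_categories-1]. Returns a float tensor of
    shape (N, num_categories)."""
    index = labels.reshape(-1, 1).long()
    encoded = torch.zeros(index.size(0), num_categories)
    # For every row i at once, write a 1 into column index[i, 0].
    encoded.scatter_(1, index, 1.0)
    return encoded


labels = torch.tensor([3, 0, 9])
print(one_hot(labels, 10))
```

Releases newer than the 0.4.1 mentioned above also provide `torch.nn.functional.one_hot`, which returns an integer tensor and would need a `.float()` before being passed to `MSELoss`.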

InnovArul · September 30, 2018, 7:13pm

Go through this post:

Carson · October 2, 2018, 9:31pm

Thank you so much, this solved my problem!
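For readers who land here with the same symptom: `torch.nn.Module` only registers attributes that are themselves modules or parameters, so layers kept in a plain Python list are invisible to `parameters()` and `state_dict()`. The sketch below assumes that is the cause here and holds the layers in a `torch.nn.ModuleList` instead. It is a minimal rewrite of the `NNet` class from the question, not necessarily what the linked post recommends.

```python
import torch


class NNet(torch.nn.Module):
    """The network from the question, with the layers held in a ModuleList."""

    def __init__(self, layer_shapes, activation_functions):
        super().__init__()
        assert len(layer_shapes) == len(activation_functions) + 1
        # ModuleList registers each Linear layer as a submodule, so the
        # inherited parameters() and state_dict() can see the weights.
        # No parameters() override is needed.
        self.linear_functions = torch.nn.ModuleList([
            torch.nn.Linear(n_in, n_out)
            for n_in, n_out in zip(layer_shapes[:-1], layer_shapes[1:])
        ])
        # Kept as a plain list so that non-Module callables also work;
        # the activations used here have no weights to save.
        self.activation_functions = list(activation_functions)

    def forward(self, x):
        for linear, activation in zip(self.linear_functions,
                                      self.activation_functions):
            x = activation(linear(x))
        return x


model = NNet([784, 16, 10], [torch.nn.Tanh(), torch.nn.Softmax(dim=1)])
print(list(model.state_dict().keys()))

# Round trip: copy the weights into a fresh network of the same shape.
fresh = NNet([784, 16, 10], [torch.nn.Tanh(), torch.nn.Softmax(dim=1)])
fresh.load_state_dict(model.state_dict())
```

With the layers registered, `torch.save(model.state_dict(), path)` followed by `load_state_dict(torch.load(path))` on a fresh network restores the trained weights, so they do not have to be copied by hand.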