A simple extension of nn.Sequential

Hi there!

I’m working through some Udacity courses on PyTorch and decided to go the extra mile by extending the nn.Sequential class. I wanted to automate defining each layer’s activations by just passing a tuple containing the number of nodes in each layer.

So normally if I wanted to perform a forward pass with an already initialized nn.Sequential model, I’d simply use

out = model(x)
# OR
out = model.forward(x)

Now that I’ve extended the class, I am trying to use

out = self(x)
# OR
out = self.forward(x)

and am getting the following error:

TypeError: forward() missing 1 required positional argument: 'target'

I’ve done nothing to alter the forward method at all, so I’m quite confused. I’d appreciate any help. Thank you!

The full code for my class is below:

import torch.nn as nn
import torch.optim as optim

class Network(nn.Sequential):
    def __init__(self, layers):
        super().__init__(self.init_modules(layers))
        self.criterion = nn.NLLLoss()
        self.optimizer = optim.Adam(self.parameters(), lr=0.003)

    def train(self, trainloader, epochs):
        for e in range(epochs):
            for x, y in trainloader:
                x = x.view(x.shape[0], -1)  # flatten each image to a vector
                self.optimizer.zero_grad()
                loss = self.criterion(self(x), y)
                loss.backward()
                self.optimizer.step()

    def init_modules(self, layers):
        # Logic unimportant to the question (I think)
        ...

Hi,

The problem is that nn.Sequential collects all the modules in your network and, in .forward, passes your input through each of them in turn. self.criterion is itself a module, so it gets registered alongside your layers. If you print the module list, you can see it:

net = Network()
list(net.modules())

# [Network(
#   (criterion): NLLLoss()
# ), NLLLoss()]

As you can see, nn.Sequential passes x straight into NLLLoss, and that module obviously needs a target too, hence the error. (The optimizer is not an nn.Module, so it is never registered as a submodule; the criterion is the culprit here.)
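
To make it concrete, here is a minimal sketch (not your exact code) that reproduces the same error:

import torch
import torch.nn as nn

class Broken(nn.Sequential):
    def __init__(self):
        super().__init__(nn.Linear(4, 2), nn.LogSoftmax(dim=1))
        # Assigning an nn.Module attribute registers it as a submodule,
        # so it joins the Sequential forward pipeline:
        self.criterion = nn.NLLLoss()

net = Broken()
out = net(torch.randn(3, 4))
# TypeError: forward() missing 1 required positional argument: 'target'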

The problem is that you are defining a full training setup (loss, optimizer, etc.) by extending nn.Sequential, which is not correct. If you want to extend nn.Sequential, you need to follow its own patterns.

Here is an instance:

import torch.nn as nn

class Network(nn.Sequential):
    def __init__(self, layers=(nn.Conv2d(1, 3, 3), nn.Conv2d(3, 10, 3)), act='relu'):
        super(Network, self).__init__()
        self.init_modules(layers, act)

    def init_modules(self, layers, act):
        for idx, module in enumerate(layers):
            self.add_module(str(idx), module)
            if act == 'relu':
                # interleave an activation after every layer
                self.add_module(str(idx) + 'act', nn.ReLU())

Module list in this case:

[Network(
   (0): Conv2d(1, 3, kernel_size=(3, 3), stride=(1, 1))
   (0act): ReLU()
   (1): Conv2d(3, 10, kernel_size=(3, 3), stride=(1, 1))
   (1act): ReLU()
 ),
 Conv2d(1, 3, kernel_size=(3, 3), stride=(1, 1)),
 ReLU(),
 Conv2d(3, 10, kernel_size=(3, 3), stride=(1, 1)),
 ReLU()]

PS: note the line with super; you have to call the parent constructor with the same structure, super(MyModule, self).__init__() (or simply super().__init__() in Python 3).

Best

I have to admit that I’m still confused. I’m sorry, I’m just learning OOP in Python. I want my extended class to have two extra attributes that are not modules for the feedforward pass.

Those two attributes are the optimizer and the loss criterion. I don’t want them to be part of the layer/module pipeline. I just want them as attributes for use in my training method.

An additional problem is that I don’t want to have to initialize the object with pre-existing modules. Rather, I want to declare the object with the layers tuple and have the modules dynamically generated based on that tuple. In your example, you initialized the object with two Conv2d layers, which is the opposite of what I am trying to achieve.

Where can I declare attributes so that they DO NOT end up as modules? I thought all unique child attributes were supposed to be declared in the __init__ method. After declaring the unique child attributes, I thought you called super().__init__() in order to have the parent class finish filling out the object.

Can you use pseudocode or Python code to show what you mean by this? I cannot figure out the tuple of layers; what are the values in the tuple?


I was able to get my code to work by removing self.optimizer = optim.Adam() and self.criterion = nn.NLLLoss() from the constructor, as you suggested. I had to make them normal variables in the training method rather than class attributes. That’s a little bit of a bummer, but at least I have working code :smiley:

To answer your question:

The ‘layers’ tuple is one where each value is the number of nodes in the corresponding layer:

layers = (n_nodes_layer1, n_nodes_layer2, ...)

So now the code below works fine. I had to move the optimizer and the criterion into the training method as regular variables rather than class attributes. It is less flexible this way, but it works now:

from collections import OrderedDict

import torch.nn as nn

class LSequential(nn.Sequential):
    def __init__(self, layers):
        super().__init__(self.init_modules(layers))

    def init_modules(self, layers):
        n_layers = len(layers)
        modules = OrderedDict()

        # Layer definitions for input and inner layers:
        for i in range(n_layers - 2):
            modules[f'fc{i}'] = nn.Linear(layers[i], layers[i + 1])
            modules[f'relu{i}'] = nn.ReLU()

        # Definition for output layer:
        modules['fc_out'] = nn.Linear(layers[-2], layers[-1])
        modules['smax_out'] = nn.LogSoftmax(dim=1)

        return modules
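For example, assuming an MNIST-style setup (784 flattened inputs, 10 classes; the shapes here are only for illustration):

import torch
import torch.nn as nn

model = LSequential((784, 128, 64, 10))
x = torch.randn(32, 784)           # a batch of 32 flattened images
out = model(x)                     # log-probabilities, shape (32, 10)
loss = nn.NLLLoss()(out, torch.randint(0, 10, (32,)))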

Now that makes sense. I was wondering how one could create a network combining Conv1d, Conv2d, and a few other layers using just a tuple of in/out channel counts.

Let me disagree with you on this point. Actually, one of the most fascinating things about PyTorch is its modularity, and in this case you are building a customized model: a network that will be represented as a computational graph. This model can be used for different tasks with different purposes, so embedding the loss and optimizer within the train method makes it less modular. For instance, if I wanted to use your implementation on my own data, I would have to change the source code directly rather than passing the final model and its outputs to a custom loss or optimizer.
If you dig deeper into PyTorch’s source code, you will enjoy its modularity!
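
For instance, here is a sketch of the more modular pattern (reusing your LSequential, and assuming trainloader exists as in your original code):

import torch.nn as nn
import torch.optim as optim

# The model stays a pure computational graph; the loss and optimizer
# are injected from outside, so either can be swapped freely.
def train(model, loader, criterion, optimizer, epochs=1):
    for _ in range(epochs):
        for x, y in loader:
            x = x.view(x.shape[0], -1)    # flatten, as in your train method
            optimizer.zero_grad()
            loss = criterion(model(x), y)
            loss.backward()
            optimizer.step()

model = LSequential((784, 128, 64, 10))
train(model, trainloader, nn.NLLLoss(), optim.Adam(model.parameters(), lr=0.003))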

If you are interested in creating high-level classes and wrapping all those steps into a single class, I think reading the source code of PyTorch Lightning and Ignite would be a good idea.


Agreed! As I’m learning, I’m being astounded by the flexibility of PyTorch. You can customize everything. (I don’t know much about neural nets yet, so I’m sure I’ll continue to be blown away!)

I meant flexibility in programming style. I’m coming from a Java background, where the parent cannot see child-specific attributes unless they are passed to super(). That means I have to tip-toe around child/parent relationships, because unless I REALLY know what’s going on in the parent, I can’t trust it not to mess with the child in unexpected ways.

But Python is not Java, so I’m okay with learning a new style. And this inheritance weirdness spurred me to look at the source code for nn.Sequential, and for a change I actually was able to understand what was going on. Normally, source code only confuses me more.
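
The part that finally made it click, as a rough sketch (the Demo class here is just my own toy example):

import torch.nn as nn

# nn.Module's __setattr__ auto-registers any attribute that is itself an
# nn.Module, and nn.Sequential.forward then chains the input through every
# registered submodule in order.
class Demo(nn.Sequential):
    def __init__(self):
        super().__init__(nn.Linear(4, 2))
        self.criterion = nn.NLLLoss()   # an nn.Module: gets registered
        self.lr = 0.003                 # a plain float: NOT registered

print(list(Demo().modules()))
# criterion appears in the module list; lr does not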

All in all, PyTorch has been an awesome experience.
