Understanding Net class

I’m looking at the convnet example in the 60-min blitz tutorial, and it’s not obvious to me how the layers were split between the __init__ and forward functions. For example, why are the convolutional layers in __init__, but the pooling layers in forward?
Where should I define a dropout layer, or a batchnorm layer? Is this explained somewhere?

import torch.nn as nn
import torch.nn.functional as F

class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        self.conv1 = nn.Conv2d(1, 6, 5)
        self.conv2 = nn.Conv2d(6, 16, 5)
        self.fc1 = nn.Linear(16 * 5 * 5, 120)
        self.fc2 = nn.Linear(120, 10)

    def forward(self, x):
        x = F.max_pool2d(F.relu(self.conv1(x)), (2, 2))
        x = F.max_pool2d(F.relu(self.conv2(x)), 2)
        x = x.view(x.size(0), -1)  # flatten all dimensions except batch
        x = F.relu(self.fc1(x))
        x = self.fc2(x)
        return x

Thanks!


Things with weights are created and initialized in __init__, while the network’s forward pass (which uses modules both with and without weights) is performed in forward. All the parameterless modules used in a functional style (F.) in forward could also be created as their object-style versions (nn.) in __init__ and used in forward the same way the modules with parameters are.
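For example, the tutorial’s Net could just as well be written with the parameterless ops as objects in __init__ (a rough sketch, functionally equivalent to the F. version; sharing one pool/relu object is fine since they hold no parameters):

import torch.nn as nn

class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        self.conv1 = nn.Conv2d(1, 6, 5)
        self.conv2 = nn.Conv2d(6, 16, 5)
        # parameterless modules created as objects instead of F. calls
        self.pool = nn.MaxPool2d(2)
        self.relu = nn.ReLU()
        self.fc1 = nn.Linear(16 * 5 * 5, 120)
        self.fc2 = nn.Linear(120, 10)

    def forward(self, x):
        x = self.pool(self.relu(self.conv1(x)))
        x = self.pool(self.relu(self.conv2(x)))
        x = x.view(x.size(0), -1)
        x = self.relu(self.fc1(x))
        return self.fc2(x)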


Thanks, this makes sense! However I noticed this example:

import torch.nn as nn
import torch.nn.functional as F

class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        self.conv1 = nn.Conv2d(3, 32, kernel_size=3)
        self.conv2 = nn.Conv2d(32, 64, kernel_size=3)
        self.conv2_drop = nn.Dropout2d()
        self.fc1 = nn.Linear(2304, 256)
        self.fc2 = nn.Linear(256, 17)

    def forward(self, x):
        x = F.relu(F.max_pool2d(self.conv1(x), 2))
        x = F.relu(F.max_pool2d(self.conv2_drop(self.conv2(x)), 2))
        x = x.view(x.size(0), -1)  # flatten layer
        x = F.relu(self.fc1(x))
        x = F.dropout(x, training=self.training)
        x = self.fc2(x)
        return F.sigmoid(x)

Do you know why they defined the conv2_drop layer in __init__?


No particular reason. Sometimes it’s convenient to use the nn. version of a parameterless module, like if you pass a dropout ratio into __init__ and want to use that for all dropout layers in your model.
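For instance, a sketch along these lines (the p argument and the layer sizes are only illustrative):

import torch.nn as nn
import torch.nn.functional as F

class Net(nn.Module):
    def __init__(self, p=0.5):
        super(Net, self).__init__()
        self.conv1 = nn.Conv2d(3, 32, kernel_size=3)
        self.conv2 = nn.Conv2d(32, 64, kernel_size=3)
        self.fc1 = nn.Linear(2304, 256)
        self.fc2 = nn.Linear(256, 17)
        # one ratio passed into __init__ configures every dropout layer
        self.drop2d = nn.Dropout2d(p)
        self.drop = nn.Dropout(p)

    def forward(self, x):
        x = F.relu(F.max_pool2d(self.conv1(x), 2))
        x = F.relu(F.max_pool2d(self.drop2d(self.conv2(x)), 2))
        x = x.view(x.size(0), -1)
        x = self.drop(F.relu(self.fc1(x)))
        return self.fc2(x)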

@michaelklachko That was my very first network in PyTorch, so keep in mind that I was experimenting and didn’t really know what I was doing ;). That’s why my dropouts are inconsistent, with one in forward and one in __init__.

So, now that you know what you’re doing :wink: how would you do it?

Also, I noticed the second dropout is from F (not from nn). How are they different? I see that F.dropout takes a ‘training’ argument, but nn.Dropout doesn’t. How does nn.Dropout know whether we’re training or testing?

nn.Module is an interface to autograd that provides a lot of goodies for creating neural networks. There is also a functional interface, torch.nn.functional. Compare the two examples:

import torch
import torch.nn as nn
import torch.nn.functional as F

input = torch.randn(1, 1, 5, 5)

# creates a module which initializes its own weights etc.
conv = nn.Conv2d(1, 1, 3)
# uses the weights and bias initialized in the constructor
output = conv(input)

# the functional interface needs the weights created beforehand
weight = torch.randn(1, 1, 3, 3)
output2 = F.conv2d(input, weight)

One of the functionalities that comes with nn.Module is the ability to switch between training and evaluation mode, and modules use that information directly inside. The functional interface, however, is agnostic to nn.Module, so you need to specify all the arguments to the function yourself, including the training mode for functions that behave differently in train / eval.
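As a small sketch of the difference (values are only illustrative): calling .train() / .eval() flips the self.training flag that nn.Dropout reads by itself, while F.dropout has to be told explicitly:

import torch
import torch.nn as nn
import torch.nn.functional as F

drop = nn.Dropout(p=0.5)
x = torch.ones(1, 10)

drop.train()    # the module tracks the mode itself
print(drop(x))  # some elements zeroed, the rest scaled by 1 / (1 - p)
drop.eval()
print(drop(x))  # identity: dropout is disabled in eval mode

# functional version: the mode must be passed in explicitly
print(F.dropout(x, p=0.5, training=True))
print(F.dropout(x, p=0.5, training=False))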


I see! Thanks. It would be nice to have this info in the tutorial.


Thanks, this explanation is very helpful for beginners. Can you please add it to the PyTorch tutorial?

What happens when one calls Net.__init__()?
Is it like resetting the net to its default state?

I need a way to reset a net completely, as if it had just been defined, and I wondered if this is a valid way.