Which way to optimize is correct?

I'm confused about how autograd works for a custom network. Here are two examples:

Example 1

class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        self.linear1 = nn.Linear(5, 10)
        self.linear2 = nn.Linear(10, 5)

    def forward(self, x):
        x = self.linear1(x)
        x = self.linear2(x)
        return x

net = Net()
net1.net = net
net1.fc = nn.Linear(5, 10)

Then I create the optimizer; the parameters of both net1.net and net1.fc are passed in, along the lines of the sketch below.
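A minimal sketch of what I mean (the optimizer type and learning rate are just placeholders):

import itertools
import torch.optim as optim

# Hand the parameters of both submodules to a single optimizer
params = itertools.chain(net1.net.parameters(), net1.fc.parameters())
optimizer = optim.SGD(params, lr=0.01)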

Example 2

class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        self.linear1 = nn.Linear(5, 10)
        self.linear2 = nn.Linear(10, 5)
        self.fc = nn.Linear(5,10)

    def forward(self, x):
        x = self.linear1(x)
        x = self.linear2(x)
        x = self.fc(x)
        return x

net = Net()

Then I create the optimizer the same way, as sketched below.
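Here a single net.parameters() call covers all three layers (again, optimizer type and learning rate are placeholders):

import torch.optim as optim

optimizer = optim.SGD(net.parameters(), lr=0.01)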

Which way is correct? It seems that if you don't call a module in the forward function, you get wrong gradients.

In the first example your definition of net1 is missing.
It could probably look like this:

class Net1(nn.Module):
    def __init__(self):
        super(Net1, self).__init__()
        self.net = nn.Linear(5, 5)
        self.fc = nn.Linear(5, 10)

    def forward(self, x):
        x = self.net(x)
        x = self.fc(x)
        return x

If this is the case, both models will work just fine.
Your first example looks more like a pre-trained model, where you would like to change the last linear layer to match your number of classes.
The second example is just a vanilla model.

As a small side note: you are missing the non-linearities. In PyTorch, layers do not include any activation functions, so you would have to add them to your models yourself.
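For example, Net from above with a ReLU in between (the choice of activation is just an example):

import torch.nn as nn
import torch.nn.functional as F

class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        self.linear1 = nn.Linear(5, 10)
        self.linear2 = nn.Linear(10, 5)

    def forward(self, x):
        x = F.relu(self.linear1(x))  # non-linearity between the two layers
        x = self.linear2(x)
        return x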

@ptrblck Yes, I just want to use a pretrained model and replace its final layer with my own layer. My question is: if I modify this model outside of the model class (as in Example 1), then the output is calculated as:

output = net1.fc(net1(input))

However, the fc layer is not in the forward function. Will this influence the backward pass?

If you are modifying a pre-trained model using model.fc = ..., then this layer should be in the forward method.
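For instance, with a torchvision ResNet the fc attribute you replace is exactly the one its existing forward already calls, so the swap is transparent (resnet18 and the class count here are just examples):

import torch.nn as nn
from torchvision.models import resnet18

model = resnet18(pretrained=True)
model.fc = nn.Linear(model.fc.in_features, 10)  # forward() still calls self.fc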
Could you post the model definition or a link to it?

@ptrblck This is the model, using a pretrained AlexNet. I just tried to reimplement the network in this figure:
Deep Domain Confusion

import torch
import torch.nn as nn
from torchvision.models import alexnet

class AlexNetFc(nn.Module):

    def __init__(self, pretrained=False, num_classes=1000):
        super(AlexNetFc, self).__init__()
        model_alexnet = alexnet(pretrained=pretrained)
        self.features = model_alexnet.features
        # Copy everything except the last linear layer of the pretrained classifier
        self.classifier = nn.Sequential()
        for i in range(6):
            self.classifier.add_module("classifier" + str(i), model_alexnet.classifier[i])
        self.__in_features = model_alexnet.classifier[6].in_features
        self.nfc = nn.Linear(4096, num_classes)

    def forward(self, x, y):
        x = self.features(x)
        y = self.features(y)
        dist = self.distance_function(x, y)
        x = x.view(x.size(0), 256 * 6 * 6)
        x = self.classifier(x)
        x = self.nfc(x)
        y = y.view(y.size(0), 256 * 6 * 6)
        y = self.classifier(y)
        y = self.nfc(y)
        return x, y, dist

    def output_num(self):
        return self.__in_features

    def distance_function(self, x, y):
        # Some distance, e.g. the Euclidean distance between the two feature batches
        return torch.norm(x - y, p=2)

The code looks good.
From the image you've provided, it looks like the model is shared between the two inputs, so it should work this way.
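A rough sketch of how such a model could be trained; the loss composition, the weight lambda_, and the dummy data are my assumptions, not taken from the paper's code:

import torch
import torch.nn as nn
import torch.optim as optim

model = AlexNetFc(pretrained=True, num_classes=10)  # num_classes is an example
criterion = nn.CrossEntropyLoss()
optimizer = optim.SGD(model.parameters(), lr=0.01)
lambda_ = 0.25  # hypothetical weight on the distance term

source_batch = torch.randn(8, 3, 224, 224)   # dummy source images
target_batch = torch.randn(8, 3, 224, 224)   # dummy target images
source_labels = torch.randint(0, 10, (8,))   # dummy source labels

out_x, out_y, dist = model(source_batch, target_batch)
loss = criterion(out_x, source_labels) + lambda_ * dist  # classification + distance term
optimizer.zero_grad()
loss.backward()
optimizer.step()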

Thank you! I just want to know whether it is still correct if I move the distance_function out of the model class and use the model as in Example 1. I get really confused about this.