Merging networks

I have a network such as

import torch
import torch.nn as nn


class NeuralNet(nn.Module):

    def __init__(self):
        super(NeuralNet, self).__init__()
        self.fc1 = nn.Linear(10, 5)
        self.fc2 = nn.Linear(5, 2)

    def forward(self, x):
        x = nn.functional.relu(self.fc1(x))
        x = nn.functional.relu(self.fc2(x))
        return x

How can I handle the loss if I want to split the network up? E.g.:



class Layer1(nn.Module):

    def __init__(self):
        super(Layer1, self).__init__()
        self.fc1 = nn.Linear(10, 5)

    def forward(self, x):
        x = nn.functional.relu(self.fc1(x))
        return x

class Layer2(nn.Module):

    def __init__(self):
        super(Layer2, self).__init__()
        self.fc2 = nn.Linear(5, 2)

    def forward(self, x):
        x = nn.functional.relu(self.fc2(x))
        return x

The only loss is from the final layer, where the output is compared to a label.

How can you call .backward() on this?

You could just call these models in a sequential manner (same as in NeuralNet):

layer1 = Layer1()
layer2 = Layer2()

criterion = nn.CrossEntropyLoss()  # e.g. a classification loss for the 2-class output

x = torch.randn(1, 10)
target = torch.randint(0, 2, (1,))

out = layer1(x)
out = layer2(out)

loss = criterion(out, target)
loss.backward()
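
If you then want to update both parts with a single optimizer, you could pass the parameters of both modules to it. A minimal sketch, assuming the layer1/layer2 setup from above and plain SGD as the optimizer:

import itertools

import torch.optim as optim

# one optimizer over the parameters of both modules
optimizer = optim.SGD(
    itertools.chain(layer1.parameters(), layer2.parameters()), lr=0.1
)

optimizer.zero_grad()
out = layer2(layer1(x))
loss = criterion(out, target)
loss.backward()    # gradients flow back through layer2 and then layer1
optimizer.step()   # updates the parameters of both modules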

@ptrblck Will loss.backward() calculate the gradients for all parameters in layer1 and layer2?

Yes, Autograd will track all operations and will make sure to compute the gradients for all used parameters.
You can treat "models" as standard layers.
In fact, your custom models and PyTorch's built-in layers both derive from nn.Module, so you can chain models just as you would chain layers.
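
For example, since both custom modules derive from nn.Module, you could also wrap them in an nn.Sequential container and check that backward populates the gradients of both parts. A minimal sketch using the Layer1/Layer2 classes defined above:

model = nn.Sequential(Layer1(), Layer2())

out = model(torch.randn(1, 10))
loss = nn.CrossEntropyLoss()(out, torch.randint(0, 2, (1,)))
loss.backward()

# both submodules received gradients
print(model[0].fc1.weight.grad is not None)  # True
print(model[1].fc2.weight.grad is not None)  # True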