Variable parts of hidden layers in a network

Hello community,

my question is about a variable output, or parts of a net that can vary. For example, a flag would determine which output (or which part of the net) to use. That means I have different hidden layers, and a flag decides which one is taken. Are there any examples available? Could you provide one?

Here is a small illustration:

best regards!

You can just pass a flag into your model’s forward to choose a certain path:

import torch
import torch.nn as nn


class MyModel(nn.Module):
    def __init__(self):
        super(MyModel, self).__init__()
        self.path1 = nn.Linear(5, 10)
        self.path2 = nn.Linear(10, 10)

    def forward(self, x, path):
        # the flag selects which submodule is used in this forward pass
        if path == 'path1':
            x = self.path1(x)
        elif path == 'path2':
            x = self.path2(x)
        else:
            raise ValueError('unknown path: {}'.format(path))
        return x


model = MyModel()
x1 = torch.randn(1, 5)
output1 = model(x1, 'path1')   # uses only self.path1
x2 = torch.randn(1, 10)
output2 = model(x2, 'path2')   # uses only self.path2

Thank you for your answer!

One more question: how does learning work (the backpropagation?)

Very briefly (for the supervised setting):

  1. You do a forward pass; after all the calculations, the final linear layer outputs a vector of shape [1, nc], where nc is the number of classes.
  2. This output gets compared to the true classes, and we get a loss value.
  3. This loss gets backpropagated.
  4. Every parameter in the model gets updated with respect to this loss.
  5. You have a better model! Hurray! (A minimal sketch of these steps follows below.)
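
Here is a minimal sketch of these five steps, reusing the MyModel from above and treating its 10 output features as nc = 10 classes; the batch size, fake labels, and SGD settings are just made-up placeholders:

import torch
import torch.nn as nn

model = MyModel()
criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)

x = torch.randn(8, 5)                # a batch of 8 samples for path1
target = torch.randint(0, 10, (8,))  # fake ground-truth class indices

output = model(x, 'path1')           # 1. forward pass -> shape [8, nc]
loss = criterion(output, target)     # 2. compare to the true classes -> loss value
optimizer.zero_grad()
loss.backward()                      # 3. backpropagate the loss
optimizer.step()                     # 4. update every parameter w.r.t. this loss
# 5. (hopefully) a slightly better model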

Let’s revisit the magic word: backpropagation.

This is an algorithm that computes, for each model parameter, the gradient of the loss with respect to that parameter.
A gradient is basically a number that has a direction and a magnitude.
The direction is indicated by torch.sign of this value (i.e. whether it is positive or negative),
and moving the model parameter in this direction will increase the loss, scaled by the magnitude.
So, we move in the opposite direction (i.e. negate the gradient) and thus decrease the loss :smile:
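
A tiny sketch of that idea, with a single made-up parameter w and the toy loss (w - 3)**2:

import torch

w = torch.tensor(1.0, requires_grad=True)
loss = (w - 3) ** 2           # loss is 4.0 at w = 1.0
loss.backward()

print(w.grad)                 # tensor(-4.) -> negative sign: increasing w lowers the loss
with torch.no_grad():
    w -= 0.1 * w.grad         # step against the gradient
print(((w - 3) ** 2).item())  # 2.56 -> the loss decreased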

That was not very brief, but I think it’s clear.

Also read this: https://medium.com/@karpathy/yes-you-should-understand-backprop-e2f06eab496b

I didn’t ask what backpropagation is. Your post doesn’t answer my question.

how does learning work (the backpropagation?)

My bad! Seeing the question mark after backprop, I assumed that was your question.

Since the computation graph is created dynamically during the forward pass, only the parameters that were used to calculate the loss will get a valid gradient and be updated.
I.e. if you only use path1 during training, only self.path1 will be updated, while self.path2 keeps its initial values.
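
You can check this directly: run a forward pass through path1 only, backpropagate a dummy loss, and look at the .grad attributes (this just reuses the MyModel from above):

import torch

model = MyModel()
x = torch.randn(1, 5)

out = model(x, 'path1')
out.mean().backward()            # any scalar works as a dummy loss here

print(model.path1.weight.grad)   # populated gradient tensor
print(model.path2.weight.grad)   # None -> path2 was never part of the graph
# an optimizer.step() would therefore only change path1's parameters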
