# Variable parts of hidden layers in a network

Hello community,

My question is about a variable output, or parts of a net that can vary. For example, a flag would decide which output (or which part of the net) to use. That means I have different hidden layers, and a flag decides which one to take. Are there any examples available? Could someone provide one?

Here is a small illustration: *(image attachment omitted)*

best regards!

You can just pass a flag into your model’s `forward` to choose a certain path:

```python
import torch
import torch.nn as nn

class MyModel(nn.Module):
    def __init__(self):
        super(MyModel, self).__init__()
        self.path1 = nn.Linear(5, 10)
        self.path2 = nn.Linear(10, 10)

    def forward(self, x, path):
        # The flag decides which submodule processes the input
        if path == 'path1':
            x = self.path1(x)
        elif path == 'path2':
            x = self.path2(x)
        else:
            raise ValueError('unknown path {}'.format(path))
        return x

model = MyModel()
x1 = torch.randn(1, 5)
output1 = model(x1, 'path1')
x2 = torch.randn(1, 10)
output2 = model(x2, 'path2')
```

One more question: how does learning work (the backpropagation)?

### In very brief (for a supervised setting)

1. You do a forward pass; after all the calculations, the final linear layer outputs a tensor of shape `[1, nc]`, where `nc` is the number of classes.
2. This gets compared to the true classes, and we get a value of `loss`.
3. This loss gets backpropagated.
4. Every parameter in the model gets updated with respect to this loss, as sketched in the code after this list.
5. You have a better model! Hurray!
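
Here is a minimal sketch of one such training step (the layer sizes, learning rate, and sample data below are just placeholder assumptions, not from the original post):

```python
import torch
import torch.nn as nn

nc = 10                                   # number of classes (illustrative)
model = nn.Linear(5, nc)                  # stand-in for any model
criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)

x = torch.randn(1, 5)                     # one input sample
target = torch.tensor([3])                # true class index

output = model(x)                         # 1. forward pass -> [1, nc] scores
loss = criterion(output, target)          # 2. compare to the true class
optimizer.zero_grad()
loss.backward()                           # 3. backpropagate the loss
optimizer.step()                          # 4. update every parameter w.r.t. this loss
```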

### Let’s revise the magic word: backpropagation

This is an algorithm where each model parameter is related to the `loss`. For each parameter we calculate a `gradient`, which is basically a number that has a direction and a magnitude. The direction is indicated by the `torch.sign` of this value (i.e. whether it is positive or negative), and moving the parameter in this direction will increase the loss at a rate proportional to the magnitude. So we move in the opposite direction (i.e. negate the `gradient`) and thus decrease the `loss`. That was not very brief, but I think it’s clear.
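
As a tiny illustration of the direction/magnitude idea (a toy scalar example I’m adding here, not from the original post):

```python
import torch

# Toy loss: loss = w**2, so dloss/dw = 2*w.
w = torch.tensor(3.0, requires_grad=True)
loss = w ** 2
loss.backward()

print(w.grad)              # tensor(6.) -> the gradient
print(torch.sign(w.grad))  # tensor(1.) -> loss grows if w grows

# Step opposite to the gradient to decrease the loss.
lr = 0.1
with torch.no_grad():
    w -= lr * w.grad
print(w)                   # tensor(2.4000, ...) -> loss dropped from 9.0 to 5.76
```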

I didn’t ask what backpropagation is. Your post does not fit my question.

> how does learning work (the backpropagation?)

My bad! Seeing the question mark after backprop, I assumed that was your question.

Since the computation graph is created dynamically during the forward pass, only the parameters that were used to calculate the loss will be updated (and get a valid gradient).
I.e. if you only use `path1` during training, only `self.path1` will be updated, while `self.path2` keeps its initial values.
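
You can check this yourself (a quick sketch reusing `MyModel` and the imports from the snippet above):

```python
model = MyModel()
x = torch.randn(1, 5)
out = model(x, 'path1')   # only path1 appears in the computation graph
out.sum().backward()

print(model.path1.weight.grad is None)  # False -> path1 got a valid gradient
print(model.path2.weight.grad is None)  # True  -> path2 was never used
```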
