# How to change the loss function for different layers

Hi,

I am trying to recreate this paper: Continuous Learning in Single Incremental Tasks

One of the algorithms in the paper involves training the final layer w.r.t. a loss function (e.g. cross-entropy), but training all preceding layers with a regularization term added to the back-propagated loss. As I understand it, adding a regularization term for the entire network is done by adding the term to the loss value before the backward/step:

```python
loss += regularization_term
loss.backward()
opt.step()
```

But according to my understanding that modifies the loss for the entire network, so I am stuck on how to change the loss only for the layers preceding the final one.


Not necessarily. The regularization term will only modify the parameters which are in its computation graph.
Have a look at this dummy example:

```python
import torch
import torch.nn as nn

model = nn.Sequential(
    nn.Linear(1, 1, bias=False),
    nn.Sigmoid(),
    nn.Linear(1, 1, bias=False)
)

criterion = nn.MSELoss()
x = torch.randn(1, 1)
target = torch.ones(1, 1)

output = model(x)
loss = criterion(output, target)
loss.backward()

print('Before regularization')
print(model[0].weight.grad)
print(model[2].weight.grad)

model.zero_grad()  # clear the gradients so the two runs are comparable
output = model(x)
loss = criterion(output, target)
loss = loss + torch.norm(model[0].weight)  # regularize only the first layer's weight
loss.backward()

print('After regularization')
print(model[0].weight.grad)
print(model[2].weight.grad)
```
As you can see, the gradient of the second linear layer (`model[2]`) stays the same, while the gradient of the first one (`model[0]`) changes.
This is due to the fact that the parameters of `model[2]` were not involved in creating the regularization term, so they won't be touched by it.
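For the original question (regularizing every layer except the final one), the same idea scales up: build the regularization term only from the parameters of the preceding layers. Here is a minimal sketch, assuming the final layer is the last child of an `nn.Sequential` (the model, penalty weight, and L2 penalty are placeholders, not from the paper):

```python
import torch
import torch.nn as nn

model = nn.Sequential(
    nn.Linear(4, 8),
    nn.ReLU(),
    nn.Linear(8, 2)  # final layer: excluded from the regularization term
)
criterion = nn.CrossEntropyLoss()

x = torch.randn(3, 4)
target = torch.randint(0, 2, (3,))

# Collect the parameters of every child module except the last one.
reg_params = [p for m in list(model.children())[:-1] for p in m.parameters()]

# Simple L2 penalty as a stand-in for the paper's regularization term.
reg_term = sum(p.pow(2).sum() for p in reg_params)

model.zero_grad()
loss = criterion(model(x), target) + 1e-3 * reg_term
loss.backward()
```

Since `reg_term` does not involve the final layer's parameters, their gradients are exactly the plain cross-entropy gradients, while all earlier layers also receive the gradient of the penalty.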