Use model parameters in loss function


Let’s say I have one trained neural network and want to train another one with the exact same topology.
In the 2nd network’s loss function I’ll have a base loss function like MSE and I want to extend it and add something else to the loss. This “something” is the similarity between both networks’ parameters.

For now I just defined similarity as 1 / sum(abs(old model - new model)). So if the networks had the exact same parameters, this value would be infinity. The loss function is supposed to force the 2nd network to learn something different.

How would I go about doing this? I obviously have to write a custom loss function that adds this term in the forward function. I’m a little confused however on how to make sure autograd computes the correct gradient.
autograd has to somehow understand the network parameters’ influence on this additional loss term in order to compute the correct gradient.

The parameters() function returns a generator that I can iterate through to get a network’s parameters. These wouldn’t be variables though.
Should I rather get the linear layers from the model directly and access the model parameters via the .weight attribute?

I think my main problem is that I don’t really understand how autograd works and when exactly autograd can figure out the gradient.

Any help would be appreciated!

Interesting idea!
I think you can try it, although I’m not sure if your current similarity calculation is really useful.
Your criterion would yield a low loss if the weights are just scaled by a constant but still represent the same transformation.
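
To make the scaling point concrete, here is a minimal sketch (the layer sizes and the factor c = 2 are arbitrary choices for illustration, not from the thread). For a ReLU network, multiplying the first layer's weights and bias by c > 0 and dividing the second layer's weights by c leaves the output unchanged, yet the L1 distance between the two parameter sets is clearly nonzero, so 1 / distance stays small even though both networks compute the same function.

```python
import copy
import torch
import torch.nn as nn

torch.manual_seed(0)

# Reference network and an exact copy of it
net_a = nn.Sequential(nn.Linear(2, 4), nn.ReLU(), nn.Linear(4, 1))
net_b = copy.deepcopy(net_a)

# Rescale the copy: relu(c * z) == c * relu(z) for c > 0,
# so dividing the next layer's weights by c cancels the factor.
c = 2.0
with torch.no_grad():
    net_b[0].weight.mul_(c)
    net_b[0].bias.mul_(c)
    net_b[2].weight.div_(c)

x = torch.randn(8, 2)
same_output = torch.allclose(net_a(x), net_b(x), atol=1e-6)

# L1 distance between the parameters and the resulting "similarity"
l1_distance = sum((pa - pb).abs().sum()
                  for pa, pb in zip(net_a.parameters(), net_b.parameters()))
similarity = 1.0 / l1_distance

print(same_output, l1_distance.item(), similarity.item())
```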

However, I’ve created a small example with some dummy data, which should give you a starter code for your experiments:

import torch
import torch.nn as nn
import torch.nn.functional as F
import torch.optim as optim

# Create simple model
class MyModel(nn.Module):
    def __init__(self):
        super(MyModel, self).__init__()
        self.fc1 = nn.Linear(2, 4)
        self.fc2 = nn.Linear(4, 1)

    def forward(self, x):
        x = F.relu(self.fc1(x))
        return torch.sigmoid(self.fc2(x))

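# Define similarity criterion
def weight_criterion(modelA, modelB):
    loss = 0
    for paramA, paramB in zip(modelA.parameters(), modelB.parameters()):
        # detach() treats modelA's parameters as constants,
        # so only modelB's parameters receive gradients from this term
        loss += 1. / torch.sum(torch.abs(paramA.detach() - paramB))
    return loss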

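# Dummy data (negative inputs -> label 1, positive inputs -> label 0)
x = torch.cat((torch.randn(10, 2) - 1, torch.randn(10, 2) + 1), 0)
# Float targets with shape (20, 1) to match the model output for nn.BCELoss
y = torch.cat((torch.full((10, 1), 1.), torch.full((10, 1), 0.)), 0)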

modelA = MyModel()
modelB = MyModel()

optimizerA = optim.SGD(modelA.parameters(), lr=1e-2)
optimizerB = optim.SGD(modelB.parameters(), lr=1e-2)

criterion = nn.BCELoss()

# Train modelA as the reference model
for epoch in range(20000):
    optimizerA.zero_grad()
    output = modelA(x)
    loss = criterion(output, y)
    loss.backward()
    optimizerA.step()
    print('Epoch {}, loss {}'.format(epoch, loss.item()))

# Train modelB with weight_loss
for epoch in range(20000):
    optimizerB.zero_grad()
    output = modelB(x)
    bce_loss = criterion(output, y)
    weight_loss = weight_criterion(modelA, modelB)
    loss = bce_loss + weight_loss
    loss.backward()
    optimizerB.step()
    print('Epoch {}, BCELoss {}, weight_loss {}'.format(
        epoch, bce_loss.item(), weight_loss.item()))

# Compare parameters of both models
for paramA, paramB in zip(modelA.parameters(), modelB.parameters()):
    print('A: {},\nB: {}'.format(
        paramA, paramB))

Let me know if this is helpful and what you experienced in your experiments!
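
As a quick sanity check on the autograd question, here is a small standalone sketch (the two nn.Linear models are arbitrary stand-ins). model.parameters() yields nn.Parameter objects, which are tensors with requires_grad=True, so any loss built from them is tracked by autograd. Because the reference parameters are detached, backward() should only populate gradients for the second model.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
model_a = nn.Linear(2, 4)  # stands in for the trained reference model
model_b = nn.Linear(2, 4)  # stands in for the model being trained

def weight_criterion(ref, new):
    loss = 0
    for p_ref, p_new in zip(ref.parameters(), new.parameters()):
        # The reference parameters are detached, i.e. treated as constants
        loss = loss + 1. / torch.sum(torch.abs(p_ref.detach() - p_new))
    return loss

weight_loss = weight_criterion(model_a, model_b)
weight_loss.backward()

# Parameters are tensors that autograd tracks by default
params_are_tracked = all(isinstance(p, nn.Parameter) and p.requires_grad
                         for p in model_b.parameters())
# Only the second model should have gradients after backward()
a_has_grad = any(p.grad is not None for p in model_a.parameters())
b_has_grad = all(p.grad is not None for p in model_b.parameters())

print(params_are_tracked, a_has_grad, b_has_grad)
```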


Wow thanks! I’ll try this out as soon as I get home tonight.
Yeah the similarity metric is probably not gonna work like this. I would rather measure something like correlation between the parameters. I could also measure the “similarity” by comparing hidden layer outputs instead of parameters. Definitely a lot to experiment with!
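
One possible reading of "something like correlation between the parameters" is the cosine similarity of the two flattened parameter vectors. This is only a sketch under that assumption; the helper name cosine_weight_similarity is made up for illustration, and the tiny nn.Linear models are placeholders.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F
from torch.nn.utils import parameters_to_vector

def cosine_weight_similarity(ref_model, new_model):
    # Flatten all parameters into one vector per model;
    # the reference vector is detached so it acts as a constant.
    ref_vec = parameters_to_vector(ref_model.parameters()).detach()
    new_vec = parameters_to_vector(new_model.parameters())
    return F.cosine_similarity(ref_vec, new_vec, dim=0)

torch.manual_seed(0)
model_a = nn.Linear(3, 2)
model_b = nn.Linear(3, 2)

sim_self = cosine_weight_similarity(model_a, model_a)  # identical weights
sim_ab = cosine_weight_similarity(model_a, model_b)    # independent inits

# The value lies in [-1, 1] and is differentiable w.r.t. model_b
sim_ab.backward()
print(sim_self.item(), sim_ab.item())
```

Unlike 1 / L1 distance, this value is bounded, so it cannot blow up when the two parameter sets coincide. It still does not account for rescaled or permuted hidden units that compute the same function.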


So I tried this with MNIST + Cross Entropy Loss. What you probably guessed is that I’m trying to build neural network ensembles whose individual learners are as diverse as possible. I’m evaluating this by comparing their outputs. For example the “accuracy” for one learner, given the output of another learner.

As expected with this similarity cost function I couldn’t see any difference in my evaluation to just using plain cross entropy loss. I don’t have a working CUDA installation at the moment though, so I had to do the testing on my laptop CPU. I couldn’t really use a lot of networks and couldn’t train them for long. I think I’ll try to fix the CUDA installation errors first before I do any further testing.

I think as my next similarity measure I’ll use 1 / (cross entropy loss of the 2nd model, given the outputs of the first model).

So yeah, the original questions I had about autograd are pretty much solved. Thanks again!

Could you explain your approach of calculating the loss of one model given the outputs of another one?
I’m not sure how you would implement it.

Also, unrelated to your question, but if you need to create an ensemble, you could try snapshot ensembles. Basically you apply a cyclic learning rate schedule and save some good snapshots on the way. You can find the learning rate schedulers here.
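
A rough sketch of the snapshot idea, under my own assumptions: torch.optim.lr_scheduler.CosineAnnealingWarmRestarts is used as the cyclic schedule (the thread does not say which scheduler was meant), and the data, model size, cycle_length and num_cycles are arbitrary placeholders. A copy of the weights is stored at the end of each cycle, and the ensemble averages the softmax outputs of all snapshots.

```python
import copy
import torch
import torch.nn as nn

torch.manual_seed(0)
x = torch.randn(32, 2)
y = torch.randint(0, 2, (32,))

model = nn.Sequential(nn.Linear(2, 8), nn.ReLU(), nn.Linear(8, 2))
criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)

cycle_length = 5  # epochs per learning rate cycle (hypothetical value)
num_cycles = 3
scheduler = torch.optim.lr_scheduler.CosineAnnealingWarmRestarts(
    optimizer, T_0=cycle_length)

snapshots = []
for epoch in range(cycle_length * num_cycles):
    optimizer.zero_grad()
    loss = criterion(model(x), y)
    loss.backward()
    optimizer.step()
    scheduler.step()
    # Keep a copy of the weights at the end of each cycle
    if (epoch + 1) % cycle_length == 0:
        snapshots.append(copy.deepcopy(model.state_dict()))

def ensemble_predict(inputs):
    # Average the class probabilities over all stored snapshots
    member = copy.deepcopy(model)
    probs = []
    with torch.no_grad():
        for state in snapshots:
            member.load_state_dict(state)
            probs.append(torch.softmax(member(inputs), dim=1))
    return torch.stack(probs).mean(dim=0)

avg_probs = ensemble_predict(x)
print(len(snapshots), avg_probs.shape)
```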

That looks cool, I’ll look into it! I don’t really need to use ensembles for anything, I’m just doing this for fun atm.

I think I made the loss thing sound more clever than it actually is.
Usually one computes cross entropy between the network output and the actual labels. And the goal is to minimize this loss. Additionally now, I compute the cross entropy loss between the outputs of my already trained network and my new 2nd network and try to maximize this loss with my 2nd network.
So essentially I’m telling the network: “If there are 2 almost equally good predictions, choose the one that’s different from what the first network predicted.” Well, at least that’s what I’m trying to do.

So in order to achieve this I could either subtract this 2nd loss from my normal cross entropy loss or do the (1 / 2nd loss) thing and add it to my first loss. The goal is the same: higher loss when the 2 networks’ predictions are equal.

This is how I thought I’d do it. Can’t test it at the moment. I still have to somehow tell the old net that it doesn’t need gradients, but I’ll figure that out.

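# old_model, new_model and train_loader are assumed to be defined already
ce_loss = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(new_model.parameters(), lr=0.0001)
# strength of additional loss
strength = 0.001

for inputs, targets in train_loader:
    optimizer.zero_grad()
    old_output = old_model(inputs)
    new_output = new_model(inputs)
    # torch.max(..., 1)[1] returns the argmax, i.e. the class indices
    # (this assumes targets are one-hot encoded)
    standard_loss = ce_loss(new_output, torch.max(targets, 1)[1])
    additional_loss = ce_loss(new_output, torch.max(old_output, 1)[1])
    total_loss = standard_loss + strength * (1 / additional_loss)
    # or: total_loss = standard_loss - strength * additional_loss
    total_loss.backward()
    optimizer.step()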
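
For the open point about the old net not needing gradients, one common pattern is to freeze its parameters and run its forward pass under torch.no_grad(). The following is a self-contained sketch with dummy data, not a tested version of the MNIST setup: the nn.Linear models, the tensor shapes and the small eps guard against division by zero are my own assumptions, and the targets here are class indices rather than one-hot vectors.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
inputs = torch.randn(16, 4)
targets = torch.randint(0, 3, (16,))  # class indices, not one-hot

old_model = nn.Linear(4, 3)  # stands in for the already trained network
new_model = nn.Linear(4, 3)

# Freeze the old network: no gradients are stored for its parameters
old_model.eval()
for p in old_model.parameters():
    p.requires_grad_(False)

ce_loss = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(new_model.parameters(), lr=1e-4)
strength = 0.001
eps = 1e-8  # assumption: guards against division by zero

optimizer.zero_grad()
with torch.no_grad():
    # No graph is built for the old network's forward pass
    old_pred = old_model(inputs).argmax(dim=1)
new_output = new_model(inputs)
standard_loss = ce_loss(new_output, targets)
additional_loss = ce_loss(new_output, old_pred)
total_loss = standard_loss + strength / (additional_loss + eps)
total_loss.backward()
optimizer.step()

print(standard_loss.item(), additional_loss.item(), total_loss.item())
```

Note that the argmax is not differentiable anyway, so no gradient would reach the old network through additional_loss even without these steps; freezing mainly saves memory and makes the intent explicit.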