Custom loss function with trainable parameters

Hi everyone,
I need help.
I am trying to create a custom loss function with two trainable parameters.

class MyCustomLoss(nn.Module):
    def __init__(self, my_parameter1, my_parameter2):
        super(MyCustomLoss, self).__init__()
        self.A = my_parameter1
        self.B = my_parameter2

    def forward(self, inputs, targets):
        y_hat_softmax = F.softmax(inputs, dim=1)
        t = torch.argmax(y_hat_softmax, dim=1)

        ...some custom code1...

        MyLoss = self.A*(some custom code2) + self.B*(some custom code3)

        return MyLoss

For a first prototype of the idea, I did a brute-force grid search over values of A and B between 0 and 1
in small steps (0.05). That confirmed the idea works, but now I would like to get the best
values for A and B - that is, to have the network learn these weights.

Do you have any suggestions or a code snippet? I would also like to constrain A and B to be between 0 and 1, or at least keep them positive.

Thank you


To make A and B positive, an easy way is to apply ReLU to them before multiplying with the loss terms, i.e. MyLoss = torch.relu(self.A)*(some custom code2) + torch.relu(self.B)*(some custom code3). Another option is to apply torch.exp to A and B; this is a common trick in training VAEs (to keep the predicted variance positive).

To keep them between 0 and 1, you can similarly apply a sigmoid to A and B.
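Putting the sigmoid suggestion together with the original loss class, a minimal sketch could look like this (the class and parameter names are illustrative, and the two loss terms stand in for the custom code in the first post):

```python
import torch
import torch.nn as nn

class WeightedLoss(nn.Module):
    """Keeps raw parameters unconstrained and squashes them into (0, 1)
    with a sigmoid inside forward()."""
    def __init__(self):
        super().__init__()
        # Raw, unconstrained parameters; sigmoid(0.0) == 0.5
        self.raw_A = nn.Parameter(torch.tensor(0.0))
        self.raw_B = nn.Parameter(torch.tensor(0.0))

    def forward(self, loss1, loss2):
        A = torch.sigmoid(self.raw_A)  # always in (0, 1)
        B = torch.sigmoid(self.raw_B)
        return A * loss1 + B * loss2

criterion = WeightedLoss()
loss = criterion(torch.tensor(1.0), torch.tensor(2.0))
```

The gradient flows through the sigmoid into the raw parameters, so the optimizer can still update them while the effective weights stay bounded.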

@Dazitu616 thank you! It works with that approach.
I would also like to read out the final values of A and B at the end of training. How can I do that?

Thank you

Just print it out? Or load the saved state_dict of the model and find their values?

Thank you, @Dazitu616, for your suggestion.
I added A and B in the optimizer in this way:

optimizer = torch.optim.Adam(list(model.parameters())+list(criterion.parameters()), lr=learning_rate)

It seems to me that printing state_dict is the easiest way. However, the following code does not list A and B:

# Print model's state_dict
print("Model's state_dict:")
for param_tensor in model.state_dict():
    print(param_tensor, "\t", model.state_dict()[param_tensor].size())

print()

# Print optimizer's state_dict
print("Optimizer's state_dict:")
for var_name in optimizer.state_dict():
    print(var_name, "\t", optimizer.state_dict()[var_name])

Do you have a suggestion on how to access these values from the state_dict()?

Thank you

Oh, sure. I think you need to manually register these two weights on the module via register_parameter. For example:

class Model(nn.Module):

    def __init__(self):
        super().__init__()
        A = nn.Parameter(torch.tensor(0.5))  # register_parameter expects an nn.Parameter
        self.register_parameter('A', A)

Then you can access it via self.A, and this A will appear in model.state_dict().

Thank you, @Dazitu616.
I am a little bit confused. I have already registered these parameters inside the custom loss class (as in the first post of my question). So why do they need to be repeated inside the model class?
I used this in my custom loss class:

self.A = torch.nn.Parameter(torch.tensor(0.0, requires_grad=True))

Thank you

I see, sorry for the confusion. Since you registered them inside the custom loss class, you can print the state_dict of the loss to see A, i.e. criterion.state_dict()['A']
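A minimal check of this: parameters registered on the loss module show up in the loss module's own state_dict, not the model's (the ToyLoss name here is illustrative):

```python
import torch
import torch.nn as nn

class ToyLoss(nn.Module):
    def __init__(self):
        super().__init__()
        # nn.Parameter assignment registers the tensor automatically
        self.A = nn.Parameter(torch.tensor(0.0))

criterion = ToyLoss()
print(criterion.state_dict()['A'])  # current value of A
```

This is also why the earlier loop over model.state_dict() did not list A and B: they live on criterion, not on model.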

Thank you, @Dazitu616.
Yes, that is the correct syntax - I get the values. However, I get only the initial values (0.0). Do you have any idea what the catch is?

Thank you

This is just my guess: since you are minimizing the loss with an optimizer, if A goes to 0 then the loss goes to 0, which is exactly the smallest possible value. So optimizing the loss scale directly is probably not a good idea. Instead, I think you could add a constraint, e.g. A + B == 1, so that the optimizer cannot find the trivial solution of setting both A and B to 0; it has to find a balance that benefits the model. Just a random thought, though.
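The A + B == 1 constraint suggested above can be implemented by keeping two unconstrained logits and passing them through a softmax inside forward(). This is a sketch under that assumption; the class name is illustrative:

```python
import torch
import torch.nn as nn

class ConvexWeightedLoss(nn.Module):
    """Weights the two loss terms with a convex combination:
    both weights are positive and sum to exactly 1."""
    def __init__(self):
        super().__init__()
        self.logits = nn.Parameter(torch.zeros(2))  # unconstrained

    def forward(self, loss1, loss2):
        w = torch.softmax(self.logits, dim=0)  # w[0] + w[1] == 1, both > 0
        return w[0] * loss1 + w[1] * loss2

criterion = ConvexWeightedLoss()
total = criterion(torch.tensor(1.0), torch.tensor(3.0))
```

Because the weights always sum to 1, the optimizer cannot shrink the loss by driving both weights to zero.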

Thank you, @Dazitu616. Your suggestion was the correct one. The initial value can’t be 0.0 but must be some other value, for instance 1.0.
I have noticed that for some initial values, training drives A or B negative. So I used this code to keep the loss contributions positive:

MyLoss = torch.relu(self.A)*(some custom code2) + torch.relu(self.B)*(some custom code3)

But is there a way to ensure that the raw value of A itself stays positive during training?

I’m afraid not. There are no constraints in the optimizer to keep them positive, right? Maybe you can try different things like torch.relu or torch.abs. But that’s another question.
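For completeness, one common pattern (an assumption on my part, not something suggested in the replies above) is to clamp the raw parameter in place after each optimizer step, a projected-gradient-style update:

```python
import torch
import torch.nn as nn

# Illustrative: clamp the raw parameter after each optimizer step so it
# never leaves [0, 1]. The toy loss below just pushes A downward.
A = nn.Parameter(torch.tensor(0.8))
optimizer = torch.optim.Adam([A], lr=0.5)

loss = A * 2.0
loss.backward()
optimizer.step()
with torch.no_grad():
    A.clamp_(0.0, 1.0)  # project A back into [0, 1]
```

Unlike wrapping A in torch.relu inside the loss, this keeps the stored value itself within the desired range, so criterion.state_dict() reports a value that is already valid.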

Thank you, @Dazitu616, for your suggestions.