Custom Loss Function RCE


I’m trying to implement a custom loss function that is a regularized version of the standard cross entropy loss. Essentially, I’m adding a penalty term controlled by a penalty parameter. I was wondering what the best way to implement this loss function is: as an nn.Module, or as a plain def myloss() function. I can provide specifications for the function if necessary.

Thanks in advance!

If your custom loss function uses internal parameters or other arguments, I would use the nn.Module approach, as it will properly encapsulate all attributes.
On the other hand, you could just define a function if you are using a purely functional approach.


Thanks for the input! I think I’ll go with the nn.Module approach. If I do this, I shouldn’t need to define a backward function, only a forward, correct?

If you are using PyTorch methods only (no numpy etc.) you can just define forward, as Autograd will automatically create the backward.
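To make the point about forward concrete, here is a minimal sketch of a loss written as an nn.Module with only a forward method. The class name RegularizedCrossEntropy and the penalty term itself (a squared difference between the predicted probabilities of neighbouring classes) are assumptions for illustration only; the actual RCE penalty is not specified in this thread.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class RegularizedCrossEntropy(nn.Module):
    """Cross entropy plus a weighted penalty term (hypothetical penalty)."""

    def __init__(self, penalty=0.0):
        super().__init__()
        self.penalty = penalty  # stored as a plain attribute on the module

    def forward(self, logits, target):
        # Standard cross entropy on raw logits and integer class indices.
        ce = F.cross_entropy(logits, target)
        # Assumed penalty: mean squared difference between the predicted
        # probabilities of neighbouring classes (a smoothness term).
        probs = F.softmax(logits, dim=1)
        smooth = (probs[:, 1:] - probs[:, :-1]).pow(2).mean()
        return ce + self.penalty * smooth


torch.manual_seed(0)
logits = torch.randn(20, 100, requires_grad=True)  # [batch, classes]
target = torch.randint(0, 100, (20,))              # [batch] class indices

loss = RegularizedCrossEntropy(penalty=0.1)(logits, target)
loss.backward()  # no custom backward is defined; autograd derives it
```

With penalty=0.0 this sketch reduces to plain cross entropy, which gives an easy sanity check before turning the penalty on.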

Before implementing the RCE (adding the penalty parameter), I wanted to make sure my model works with the regular cross entropy function. (The RCE required for my model only guarantees smoothness in the probability distribution; when the penalty parameter is set to 0 in my custom loss, it is the normal CE loss function.) However, I don’t believe I have the data in the correct format. When calling:

nn.CrossEntropyLoss()(out, target.squeeze())

I get a runtime error:

RuntimeError: multi-target not supported at /pytorch/aten/src/THCUNN/generic/

A little background:
My data looks like this:

1.0, 72
0.9741266547569558, 69
0.8379395394315619, 68
0.6586391650305975, 77
0.8268797607858336, 55
0.1315648101238786, 69
0.016174165344821745, 74
0.09840399297894489, 87
0.6690187519456767, 65
0.38906138226255316, 68

Each row has the form y, C(dy), where y is the output of a defined mathematical function and C(dy) is the class to which that point should belong. (With this specific data set, I have 144 classes.)

dy is generated by taking y[i+1] - y[i]. The number of classes for the entire data set is:

int(np.floor(abs(dy.max() - dy.min()) / 0.04))

The 0.04 is just a specified classification ‘bin’ width.
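For illustration, the class count described above can be sketched with numpy as follows. The y values are made up, and the rule used to assign each dy to a class (its offset from the smallest dy divided by the bin width, clipped to the valid range) is an assumption; the thread only gives the formula for the number of classes.

```python
import numpy as np

bin_width = 0.04
y = np.array([1.0, 0.97, 0.84, 0.66, 0.83, 0.13])  # made-up sample values
dy = np.diff(y)  # dy[i] = y[i+1] - y[i]

# Number of classes, as given in the post.
num_classes = int(np.floor(abs(dy.max() - dy.min()) / bin_width))

# Assumed binning rule (not from the post): offset from the smallest dy,
# divided by the bin width, then clipped into [0, num_classes - 1].
classes = np.floor((dy - dy.min()) / bin_width).astype(int)
classes = np.clip(classes, 0, num_classes - 1)
```

The result is one integer class index per dy value, which is the form of target that nn.CrossEntropyLoss accepts for class-index targets.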

On the line just before I call the loss function, out is a tensor of size [20, 100] and target is also a tensor of size [20, 100] (after a target.squeeze()).

Any ideas?

Edit: Fix formatting.

After further investigation, I found that my targets were not in the correct class format. I ended up removing everything relating to numpy and used pure PyTorch methods. I fixed the loss using nn.CrossEntropyLoss by calling:

loss = nn.CrossEntropyLoss()(out, torch.max(target,1)[1])

However, I’m not quite sure what changing the targets argument actually did.
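As for what the call does: torch.max(target, 1) returns a (values, indices) pair along dimension 1, and [1] keeps the indices, so a [20, 100] target becomes a [20] tensor holding the position of the largest entry in each row. That is the class-index form nn.CrossEntropyLoss expects alongside a [20, 100] output. The sketch below assumes each target row is one-hot, which the thread does not confirm; the labels tensor is hypothetical.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
out = torch.randn(20, 100)              # [batch, classes] raw scores
labels = torch.randint(0, 100, (20,))   # hypothetical true class indices

# Assumed layout: one row per sample with a single 1 at the true class.
target = torch.zeros(20, 100)
target[torch.arange(20), labels] = 1.0

# Index of the largest entry in each row, shape [20].
indices = torch.max(target, 1)[1]       # same result as target.argmax(dim=1)
loss = nn.CrossEntropyLoss()(out, indices)
```

If the rows of target are not one-hot, this call still runs, but it simply picks the largest entry in each row as the class, which may not be what is intended.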