# Can I implement a custom loss function like this?

Dear Pytorch forum, I was wondering if you could help me with an implementation of a custom loss function. In particular, I want to perform some calculations on the model weights during my loss function - for example L1 penalty. My boundary condition is that my loss function is written as a nn.Module. Why? Because I wish to use Skorch for hyper parameter tuning, as Skorch only accepts nn.Module loss module as inputs (at least this is to the best of my understanding. I am aware, that if I wrote my NN in a Pytorch-typical loop, this problem would be much easier as I wouldn’t need to write a class for my loss, however, that way hyper parameter tuning is more cumbersome.

Assume I have a NN defined as follows:

``````
import torch
import torch.nn as nn

class NeuralNet(nn.Module):
"""
trivial network, please do not read too much into it - its a matter of principle.
Architecture: some trivial constant number of nodes for each HL, given number of nodes and HL.
Parameters:
------------------
input_dim: dimension of input layer, hl: number of hidden layers, nodes: number of nodes per hl
"""
def __init__(self, input_dim, hl, nodes):
self.input_dim = input_dim
self.hl = hl
self.nodes = nodes
self.layers = nn.ModuleList()
for h in range(hl):
if h == 0:  # input layer
self.layers.append(nn.Linear(input_dim, nodes))
self.layers.append(nn.ReLU())
else:
self.layers.append(nn.Linear(nodes, nodes))
self.layers.append(nn.ReLU())
self.layers.append(nn.Linear(nodes, 1))
self.model = nn.Sequential(*self.layers)

def _get_weights():
"""
Gets model weights only and ignores biases
note: w = [*layers.parameters()] would include biases I think
"""
w = []
for name, param in self.layers.named_parameters():
if 'weight' in name:
w.append(param)
return w

def forward(self, x):
"""
Unusual, because it returns 2 elements. why? because I want the weights to be passed to my loss function.
"""
w = self._get_weights()
x = self.model(x)
return x, w

class RegMSE(nn.Module):
"""
Custom loss function.
Example: perform L1 penalty on weights. Please ignore if this would make sense, it's a matter of principle.
"""
def __init__(self, l1):
super(RegMSE, self).__init__()
self.l1 = l1

def forward(self, y_pred, target):
"""
y_pred now includes the weights also from the forward pass
"""
# unpack y_pred
y_hat, weights = y_pred
# calculate typical MSE loss
loss = 1/(y_hat.shape[0])*torch.sum((target-y_hat) ** 2)
l = torch.tensor([0], dtype=torch.float32)
# L1 penalty
if self.l1 is not None:
for p in weights:
l += torch.norm(p, 1)
loss += self.l1 * l[0]
return loss
``````

Now in a training loop, we could have a situation where:

``````model = NeuralNet(input_dim=10, hl=2, nodes=10)
criterion = RegMSE(l1=1e-4)
``````

``````out = model(X)
loss = criterion(out, Y)
loss.backward()
optimizer.step()
``````

would this work to implement a L1 penalty? Meaning, would my weights be trained accordingly?

Because consequently, I could use it in skorch:

``````from skorch import NeuralNetRegressor

net = NeuralNetRegressor(
module=NeuralNet,
module__input_dim=X.shape[1],  # Input shape
module__nodes=10,  # Number of nodes in first hidden layer
module__hl=5,  # Number of hidden layers
criterion=RegMSE,  # Custom Loss Function
criterion__l1=1e-4,  # L1 weight penalty
batch_size=20, # Batch size
max_epochs=100, # Epochs