How to implement squared hinge loss

Hi everyone,
I need to implement the squared hinge loss in order to train a neural network with an SVM-like classifier on the last layer.
It is an image classification problem on the CIFAR dataset, so it is a multi-class classification problem. The idea is to have one "score" per output neuron (one per class), and the network should be trained by minimizing the sum of the per-class losses.
Is there an already implemented version of this loss? If not, what is the easiest way to implement it?
Thank you!
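For what it's worth, PyTorch ships `nn.MultiMarginLoss`, and with `p=2` it computes a multi-class squared hinge loss, so it may already cover this case. A minimal sketch, assuming the targets are class indices rather than one-hot vectors (the tensors below are made-up example values):

```python
import torch
import torch.nn as nn

# p=2 squares each hinge term; margin=1.0 is the usual SVM margin
criterion = nn.MultiMarginLoss(p=2, margin=1.0)

# made-up scores for one sample with 3 classes; the true class is index 0
scores = torch.tensor([[2.0, 1.5, -1.0]])
target = torch.tensor([0])

# per sample: sum over i != y of max(0, margin - scores[y] + scores[i])**2 / n_class
loss = criterion(scores, target)
```

If the labels are one-hot encoded, `labels.argmax(dim=1)` converts them to class indices.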

Hi, in the end I implemented a version of the multi-class squared hinge loss.
It makes sense in my head, but the loss goes to zero after the first epoch, and I can't figure out why.
I used batches of size 64 from the dataloader.
The labels are one-hot encoded.
Here is the code:

import torch
import torch.nn as nn

class MultiClassSquaredHingeLoss(nn.Module):
    def __init__(self):
        super(MultiClassSquaredHingeLoss, self).__init__()

    def forward(self, output, y):  # output: batchsize*n_class
        n_class = y.size()[1]
        margin = 1
        # isolate the score for the true class
        y_out = torch.sum(torch.mul(output, y)).cuda()
        output_y = torch.mul(torch.ones(n_class).cuda(), y_out).cuda()
        # create an opposite to the one hot encoded tensor
        anti_y = torch.ones(n_class).cuda() - y.cuda()
        loss = output.cuda() - output_y.cuda() + margin
        loss = loss.cuda()
        # remove the element of the loss corresponding to the true class
        loss = torch.mul(loss.cuda(), anti_y.cuda()).cuda()
        loss = torch.max(loss.cuda(), torch.zeros(n_class).cuda())
        # squared hinge loss
        loss = torch.pow(loss, 2).cuda()
        # sum up
        loss = torch.sum(loss).cuda()
        loss = loss / n_class
        return loss

I would really appreciate your help.

What does the model output look like compared to the target once the loss reaches zero?
I haven’t checked the implementation carefully, but could you use a sample output and target to check the validity of the loss?
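One way to run such a check is to compute the expected value by hand for a tiny batch and compare it with what the module returns. A minimal sketch in plain Python, assuming margin 1, one-hot targets, division by the number of classes, and an average over the batch (`reference_squared_hinge` and the sample values are made up for illustration):

```python
def reference_squared_hinge(scores, onehot, margin=1.0):
    """Loop-based multi-class squared hinge loss, for checking small cases by hand."""
    per_sample = []
    for row, label in zip(scores, onehot):
        true_idx = label.index(1)      # position of the true class in the one-hot row
        true_score = row[true_idx]
        total = 0.0
        for i, s in enumerate(row):
            if i == true_idx:
                continue               # the true class contributes no hinge term
            total += max(0.0, s - true_score + margin) ** 2
        per_sample.append(total / len(row))   # divide by the number of classes
    return sum(per_sample) / len(per_sample)  # average over the batch


# sample 1 violates the margin (class 1 outscores true class 0); sample 2 does not
scores = [[1.0, 2.0, 0.0], [0.0, 0.5, 3.0]]
onehot = [[1, 0, 0], [0, 0, 1]]
expected = reference_squared_hinge(scores, onehot)
```

If the module returns something very different for the same two rows, beyond what a different averaging convention would explain, the handling of the batch is a likely place to look.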

Hi, thank you for answering.
I think I messed something up in how the batch dimension is handled, because when I change things I get errors about the output and target dimensions.
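If the batch dimension is the culprit, one likely spot is `torch.sum(torch.mul(output, y))`, which sums over the whole batch and gives a single scalar instead of one true-class score per sample. A minimal batch-aware sketch, assuming `output` and `y` both have shape (batch_size, n_class), `y` is one-hot, and the loss is averaged over the batch (that averaging is an assumption, not something from the posted code; the sample tensors are made up):

```python
import torch
import torch.nn as nn


class MultiClassSquaredHingeLoss(nn.Module):
    """Sketch of a batch-aware multi-class squared hinge loss (one-hot targets assumed)."""

    def __init__(self, margin=1.0):
        super().__init__()
        self.margin = margin

    def forward(self, output, y):
        # output: (batch_size, n_class) scores; y: (batch_size, n_class) one-hot labels
        y = y.to(dtype=output.dtype, device=output.device)
        # true-class score per sample, shape (batch_size, 1); dim=1 keeps samples separate
        true_score = (output * y).sum(dim=1, keepdim=True)
        # hinge term for every class, broadcasting the true score across the class dimension
        hinge = torch.clamp(output - true_score + self.margin, min=0)
        # zero out the true-class column, then square
        hinge = (hinge * (1.0 - y)) ** 2
        # sum over classes, divide by n_class, then average over the batch
        return hinge.sum(dim=1).div(output.size(1)).mean()


# made-up batch of two samples with three classes
scores = torch.tensor([[1.0, 2.0, 0.0], [0.0, 0.5, 3.0]])
labels = torch.tensor([[1.0, 0.0, 0.0], [0.0, 0.0, 1.0]])
loss = MultiClassSquaredHingeLoss()(scores, labels)
```

Using `sum(dim=1, keepdim=True)` keeps a (batch_size, 1) column that broadcasts against the scores, and leaving out the `.cuda()` calls lets the loss follow whatever device the inputs are on.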