Combining CrossEntropyLoss with MSELoss

iariav · March 15, 2018, 2:01pm

Hi,
I’m currently working on a semantic segmentation problem where I want to classify every pixel in my input image (256X256) to one of 256 classes. I currently use the CrossEntropyLoss and it works OK.

In my specific problem, the class numbers 0-255 also have the property that confusing 5 with 6, for instance, is not as “bad” as confusing 5 with 200. In other words, mistaking “close” classes is not as bad as mistaking “far” classes. So I thought of adding a second loss to my system: an L2 loss, i.e. MSELoss.

However, the output of my network for an input/label of size (Batch X 256 X 256) is (Batch X 256 X 256 X 256), so I can’t use MSELoss(out, label) directly. Also, as I understand it, taking the argmax on the first dim and then applying MSELoss would make the loss non-differentiable.
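
To make the issue concrete, here is a minimal sketch. It assumes the class scores sit in dim 1, which is the layout `nn.CrossEntropyLoss` expects, and it shrinks the spatial size for illustration. It only shows that `argmax` returns an integer tensor with no gradient history.

```python
import torch

# Assumed layout: (batch, classes, height, width); sizes shrunk for illustration.
logits = torch.randn(2, 256, 4, 4, requires_grad=True)

# Hard class decision per pixel: shape (2, 4, 4), integer dtype.
hard_pred = logits.argmax(dim=1)

# Integer outputs carry no gradient history, so a loss built on them
# cannot be backpropagated to the network.
print(hard_pred.dtype, hard_pred.requires_grad)
```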

Is there any way to get around it?
Thanks in advance…

tom · March 15, 2018, 2:55pm

One way of incorporating an underlying metric into the distance between probability measures is to use the Wasserstein distance as the loss. Cross entropy loss is the KL divergence (not quite a distance, but almost) between the prediction probabilities and the one-hot distribution given by the labels. A PyTorch implementation, with a link to Frogner et al’s paper, is linked below.
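
For classes that lie on a line, as the 0-255 labels here do, the Wasserstein-1 distance has a closed form: the summed absolute difference of the two cumulative distributions. The sketch below uses that 1D shortcut per pixel, not the Sinkhorn iteration from the notebook. The function name and the unit spacing between neighbouring classes are assumptions made for illustration.

```python
import torch
import torch.nn.functional as F

def wasserstein1_ordinal_loss(logits, target):
    """Hypothetical helper: per-pixel W1 distance for ordered classes.

    logits: (B, C, H, W) raw scores; target: (B, H, W) int64 class indices.
    Assumes neighbouring classes are one unit apart.
    """
    num_classes = logits.shape[1]
    probs = F.softmax(logits, dim=1)
    # One-hot target moved to the same (B, C, H, W) layout as the prediction.
    one_hot = F.one_hot(target, num_classes).permute(0, 3, 1, 2).to(probs.dtype)
    # In 1D, W1 is the L1 distance between the cumulative distributions.
    cdf_pred = probs.cumsum(dim=1)
    cdf_true = one_hot.cumsum(dim=1)
    return (cdf_pred - cdf_true).abs().sum(dim=1).mean()
```

With a one-hot target this equals the expected absolute class error, so predicting 5 when the label is 6 costs about 1, while predicting 5 when the label is 200 costs about 195.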

An alternative could be to use the expected squared error loss per pixel. EDIT: The notebook doesn’t solve the per-pixel problem. For that you might use a Rubner-style approach (http://robotics.stanford.edu/~rubner/papers/rubnerIjcv00.pdf) with the label as a third dimension, or you might just use MSE…
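
One possible reading of the expected squared error idea, as a sketch: weight the squared distance between each class index and the label by the softmax probability of that class, and add the result to the cross entropy. The function name, the `mse_weight` parameter and its default value are hypothetical and would need tuning.

```python
import torch
import torch.nn.functional as F

def ce_plus_expected_mse(logits, target, mse_weight=1e-3):
    """Hypothetical combined loss.

    logits: (B, C, H, W) raw scores; target: (B, H, W) int64 class indices.
    """
    num_classes = logits.shape[1]
    ce = F.cross_entropy(logits, target)
    probs = F.softmax(logits, dim=1)
    # Class indices 0..C-1 shaped for broadcasting against (B, C, H, W).
    class_values = torch.arange(
        num_classes, dtype=probs.dtype, device=probs.device
    ).view(1, -1, 1, 1)
    # Squared distance from every class to the true label, per pixel.
    sq_err = (class_values - target.unsqueeze(1).to(probs.dtype)) ** 2
    # Expectation under the predicted distribution; differentiable in logits.
    expected_sq_err = (probs * sq_err).sum(dim=1).mean()
    return ce + mse_weight * expected_sq_err

# Usage with shrunken spatial size (assumed shapes, random data).
logits = torch.randn(2, 256, 8, 8, requires_grad=True)
target = torch.randint(0, 256, (2, 8, 8))
loss = ce_plus_expected_mse(logits, target)
loss.backward()
```

Because the penalty is an expectation over the softmax output rather than a function of the argmax, gradients flow back to the logits.
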
Best regards

Thomas

github.com/t-vi/pytorch-tvmisc/blob/master/wasserstein-distance/Pytorch_Wasserstein.ipynb

Batch Sinkhorn Iteration Wasserstein Distance, by Thomas Viehmann. This notebook implements Sinkhorn iteration Wasserstein distance layers. Important note from the notebook: this is under construction and does not yet work as well as it should.