Bi-LSTM CRF Loss function on pytorch tutorial page

This is the link:
http://pytorch.org/tutorials/beginner/nlp/advanced_tutorial.html#bi-lstm-conditional-random-field-discussion

I am a little puzzled by the way the loss function is written, which is as follows:

    feats = self._get_lstm_features(sentence)
    forward_score = self._forward_alg(feats)
    gold_score = self._score_sentence(feats, tags)
    return forward_score - gold_score

Shouldn't it be an L1 or L2 norm (or similar) of the difference between the ideal score and the calculated score instead? This loss will go negative very quickly, which I presume is not right. Can anybody explain? Thanks!

This is because the loss is actually the negative log-likelihood of the gold tag sequence: `forward_score` is the log partition function (the log-sum-exp of the scores of *all* possible tag paths), and `gold_score` is the score of just the gold path. Their difference equals `-log p(tags | sentence)`, and since the gold path is one of the paths summed over, the log-sum-exp can never be smaller than the gold score, so the loss should never be negative. Realizing this helped me find a bug in my own implementation. This is the paper: https://arxiv.org/pdf/1603.01360.pdf
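To see why the loss is non-negative, here is a minimal sketch (with made-up emission and transition scores, not from the tutorial) that computes the forward score by brute-force enumeration of every tag path and compares it against a gold path's score:

```python
import itertools
import math

# Toy example: 3 timesteps, 2 tags. All numbers are hypothetical.
# emissions[t][y] = emission score for tag y at timestep t
emissions = [[1.0, 0.2], [0.5, 1.5], [0.3, 0.8]]
# transitions[y_prev][y] = score for transitioning y_prev -> y
transitions = [[0.1, 0.4], [0.6, 0.2]]

def path_score(tags):
    """Total (log-space) score of one tag path: emissions + transitions."""
    s = emissions[0][tags[0]]
    for t in range(1, len(tags)):
        s += transitions[tags[t - 1]][tags[t]] + emissions[t][tags[t]]
    return s

# forward_score = log-sum-exp of the scores of ALL 2^3 tag paths (log Z).
# The tutorial's _forward_alg computes this with dynamic programming;
# brute force gives the same value for a toy example.
all_scores = [path_score(p) for p in itertools.product(range(2), repeat=3)]
m = max(all_scores)
forward_score = m + math.log(sum(math.exp(s - m) for s in all_scores))

gold = [0, 1, 1]                          # an arbitrary gold tag sequence
loss = forward_score - path_score(gold)   # = -log p(gold | sentence)
print(loss)   # always >= 0: log Z includes the gold path in its sum
```

Because the gold path contributes one term to the sum inside `forward_score`, the loss is zero only when the model puts essentially all probability mass on the gold path.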