This is the link
http://pytorch.org/tutorials/beginner/nlp/advanced_tutorial.html#bi-lstm-conditional-random-field-discussion
I am a little puzzled by how the loss function is written, which is as follows:
feats = self._get_lstm_features(sentence)
forward_score = self._forward_alg(feats)
gold_score = self._score_sentence(feats, tags)
return forward_score - gold_score
Shouldn't it be an L1 or L2 norm (or similar) of the difference between the ideal score and the calculated score instead? Won't this loss go negative very quickly? That is not what I presume is right. Can anybody explain? Thanks!
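To make the question concrete, here is a minimal toy sketch of the two quantities being subtracted (the path scores below are made up, not from the tutorial). In the tutorial's CRF, `_forward_alg` computes the log-sum-exp over the scores of all possible tag paths (the log partition function), while `_score_sentence` computes the score of the single gold path:

```python
import math

# Hypothetical toy example: a CRF over a 2-token sentence with 2 tags
# has 4 possible tag paths, each with an unnormalized log-score.
path_scores = [2.0, 0.5, -1.0, 1.5]  # log-scores of all 4 paths (made up)
gold_score = 2.0                     # log-score of the gold path (one of the above)

# forward_score = log-sum-exp over ALL paths, i.e. the log partition function
forward_score = math.log(sum(math.exp(s) for s in path_scores))

# The returned loss is the negative log-likelihood of the gold path:
#   -log p(gold) = logsumexp(all path scores) - score(gold path)
loss = forward_score - gold_score
print(loss)
```

Since the log-sum-exp over all paths is at least as large as the score of any single path (including the gold one), this difference is always non-negative in the sketch above.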