I’m trying to extend a binary classification problem to multi-class and implement it in PyTorch. The implementation of the binary problem is available in TensorFlow. The cost function is defined as

```python
cost = norm * tf.reduce_mean(tf.nn.weighted_cross_entropy_with_logits(
    logits=preds_sub, targets=labels_sub, pos_weight=pos_weight))
```
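For reference, the TF documentation describes the per-element loss that `weighted_cross_entropy_with_logits` computes (TF uses a numerically stabilized form internally; this is the naive version). A minimal plain-Python sketch of that formula, just to show where `pos_weight` enters:

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def weighted_bce_with_logits(logit, target, pos_weight):
    # Per the TF docs, pos_weight scales only the positive (target = 1) term:
    #   loss = -pos_weight * target * log(sigmoid(logit))
    #          - (1 - target) * log(1 - sigmoid(logit))
    p = sigmoid(logit)
    return -pos_weight * target * math.log(p) - (1 - target) * math.log(1 - p)

# With pos_weight = 1 this reduces to plain binary cross-entropy;
# pos_weight > 1 makes a positive error (false negative) cost more.
print(weighted_bce_with_logits(0.0, 1.0, 1.0))  # -log(0.5) ≈ 0.6931
print(weighted_bce_with_logits(0.0, 1.0, 2.0))  # ≈ 1.3863
```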
To train the network in my PyTorch implementation, I use `CrossEntropyLoss` to compare the network's class scores with the target labels. My question is whether the `pos_weight` argument of TensorFlow's `weighted_cross_entropy_with_logits` does the same thing as the `weight` argument of PyTorch's `CrossEntropyLoss`. The TF documentation states that `pos_weight` "allows one to trade off recall and precision by up- or down-weighting the cost of a positive error relative to a negative error."

My understanding is that `CrossEntropyLoss` does not differentiate between kinds of misclassification (e.g. class C being misclassified as A carries the same weight as C being misclassified as B). For the `weight` argument of `CrossEntropyLoss` to do the same thing as `weighted_cross_entropy_with_logits`, I think I would need some modification to the weights, such as a weight matrix of size (number of classes) × (number of classes), but I'm not sure. I would appreciate any comments that might clarify this.