I love the amazing PyTorch docs, but one issue we ran into is that they seem a bit sparse on how loss functions, specifically Cross Entropy, are implemented (i.e. the underlying math and/or the raw code).
Does someone have an insight into where I would go about finding that?
Context: We are implementing a custom loss function that can incorporate class weights for "distribution-based" targets (not quite sure what the specific term is).
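To make the question concrete, here is a minimal sketch of what I *believe* the class-weighted cross entropy computes for hard integer targets (log-softmax, then the negative log-likelihood of the true class scaled by its weight, with the mean taken over the sum of applied weights). This is my own NumPy reconstruction, not code from PyTorch, so corrections welcome:

```python
import numpy as np

def weighted_cross_entropy(logits, targets, class_weights):
    """Hand-rolled sketch of class-weighted cross entropy (my guess at
    what nn.CrossEntropyLoss does with `weight=` and reduction='mean')."""
    # numerically stable log-softmax along the class dimension
    shifted = logits - logits.max(axis=1, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    n = logits.shape[0]
    # weight applied to each sample = weight of its true class
    w = class_weights[targets]
    # per-sample negative log-likelihood of the true class, scaled by w
    nll = -w * log_probs[np.arange(n), targets]
    # weighted mean: divide by the sum of the weights actually applied
    return nll.sum() / w.sum()

logits = np.array([[2.0, 0.5, -1.0],
                   [0.1, 1.5, 0.3]])
targets = np.array([0, 1])
class_weights = np.array([1.0, 2.0, 1.0])
print(weighted_cross_entropy(logits, targets, class_weights))
```

What I'd like to confirm is whether this matches the actual implementation, and how it generalizes when the targets are probability distributions rather than class indices.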
(Sorry if this is an out-of-scope question – feel free to point me to wherever a question like this would be more appropriate.)