Unfortunately, I am not allowed to post any details of my CNN here (company rules). It is a very basic CNN with nothing special about it. I use Adam as the optimizer and CrossEntropyLoss as the loss function.
Since I have some unbalanced datasets, I use the weight parameter of CrossEntropyLoss to account for that:
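Since I can't show my actual code, here is a minimal sketch of what I mean; the class counts and the inverse-frequency weighting scheme are made up for illustration:

```python
import torch
import torch.nn as nn

# Hypothetical unbalanced two-class dataset: 900 samples of class 0,
# 100 samples of class 1 -> upweight the rarer class.
class_counts = torch.tensor([900.0, 100.0])
weights = class_counts.sum() / (len(class_counts) * class_counts)

criterion = nn.CrossEntropyLoss(weight=weights)

logits = torch.randn(8, 2)           # a batch of raw model outputs
targets = torch.randint(0, 2, (8,))  # ground-truth class indices
loss = criterion(logits, targets)    # scalar, weighted mean over the batch
```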
and the training is successful. To reuse the same code for other, balanced datasets, I simply set the weights to [1.0, 1.0]. In this case, the network is not trained at all. When I don't set the weight parameter at all, it trains fine again. From the definition in the docs, I don't see what could be wrong, since it would just multiply each term by a constant factor of 1: https://pytorch.org/docs/stable/nn.html#crossentropyloss
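To double-check my reading of the docs, a quick numerical test (with made-up logits and targets) confirms that weight=[1.0, 1.0] should give exactly the unweighted loss, so the weights themselves shouldn't be what breaks training:

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
logits = torch.randn(16, 2)           # arbitrary batch of raw outputs
targets = torch.randint(0, 2, (16,))  # arbitrary class indices

weighted = nn.CrossEntropyLoss(weight=torch.tensor([1.0, 1.0]))
unweighted = nn.CrossEntropyLoss()

# With all weights equal to 1, the weighted mean reduces to the plain mean.
assert torch.allclose(weighted(logits, targets), unweighted(logits, targets))
```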
It is not a big problem, because I don't have to set the weight parameter at all, but I am still asking myself whether I misunderstand the weight argument or whether my implementation is wrong.
I get the same results with your sample code. I will investigate my code further and report back with any findings. I am still not sure what the reason could be, since the weight argument is the only thing I change between my test runs.