Hey Guys
I am currently building a multi-task ML model using both F.binary_cross_entropy and F.cross_entropy. Since my datasets are quite unbalanced, I am implementing class weights and passing them as arguments to both functions to combat this. However, I am experiencing something quite peculiar with F.cross_entropy: the weight argument seems to have absolutely no effect on the loss, see below (code run in the VS Code debugger):
F.cross_entropy(out_AU_intensities[i][AU_idx], lab[AU_idx] - 1, weight = self.cw_int[i])
>tensor(1.3951, grad_fn=<NllLossBackward0>)
F.cross_entropy(out_AU_intensities[i][AU_idx], lab[AU_idx] - 1)
>tensor(1.3951, grad_fn=<NllLossBackward0>)
And the actual rescaling weights are as follows:
self.cw_int[i]
>tensor([5.5556e+00, 2.8571e+01, 1.0000e+05, 1.0000e+05])
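For what it's worth, here is a minimal standalone repro of the same symptom, assuming the default reduction='mean' (which, per the docs, divides the weighted loss sum by the sum of the per-sample weights) and a batch whose targets all happen to fall in a single class, so the class weight cancels out exactly. The tensors here are made up for illustration:

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)
logits = torch.randn(4, 3)
targets = torch.tensor([1, 1, 1, 1])  # every sample in the batch is class 1
w = torch.tensor([1.0, 100.0, 1.0])   # heavy weight on class 1

# With reduction='mean' (the default), cross_entropy computes
#   sum_i(w[y_i] * l_i) / sum_i(w[y_i]),
# so for a single-class batch the weight cancels out entirely:
unweighted = F.cross_entropy(logits, targets)
weighted = F.cross_entropy(logits, targets, weight=w)
assert torch.allclose(unweighted, weighted)

# With reduction='sum' the weight is not normalized away:
weighted_sum = F.cross_entropy(logits, targets, weight=w, reduction="sum")
assert not torch.allclose(weighted_sum, unweighted * len(targets))
```

If AU_idx happens to select labels from only one class, that would reproduce exactly what I'm seeing.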
To me it looks like it is simply not applying the weights. However, it does work when I call F.binary_cross_entropy:
F.binary_cross_entropy(out_AU, AUs, weight = self.cw_AU)
>tensor(20765.5312, grad_fn=<BinaryCrossEntropyBackward0>)
F.binary_cross_entropy(out_AU, AUs)
>tensor(0.6949, grad_fn=<BinaryCrossEntropyBackward0>)
self.cw_AU
>tensor([1.0000e+00, 1.0000e+05, 1.2500e-01, 1.0000e+05, 1.0000e+05, 1.0000e+05,
4.0000e-02, 1.6667e-01, 1.6667e-01, 3.3333e-01, 1.0101e-02, 1.9231e-02])
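For comparison, F.binary_cross_entropy treats weight differently: it is broadcast element-wise against the input, and the mean reduction divides by the number of elements rather than by the sum of the weights, so the scale of the loss does change. A minimal sketch with made-up probabilities and targets:

```python
import torch
import torch.nn.functional as F

p = torch.tensor([[0.9, 0.2], [0.6, 0.4]])  # predicted probabilities
t = torch.tensor([[1.0, 0.0], [1.0, 0.0]])  # binary targets
w = torch.tensor([1.0, 10.0])               # broadcast over the last dim

# Per-element BCE loss computed by hand:
manual = -(t * p.log() + (1 - t) * (1 - p).log())

# binary_cross_entropy with weight = plain mean of (w * per-element loss);
# the weights are NOT normalized away, so the loss magnitude shifts:
assert torch.allclose(F.binary_cross_entropy(p, t, weight=w),
                      (w * manual).mean())
```

That asymmetry in the two reductions would explain why my BCE loss explodes with the weights while the cross-entropy loss is unchanged, but I'd appreciate confirmation.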
Has anyone else experienced this problem, or am I just misunderstanding the implementation?