Thanks for your reply. @googlebot
In fact, I have already run this code on the CPU, but it still shows the result above.
Besides that, I reran it with reduction set to none several times with the same seed, and it gives the same result and the same loss every time.
From what I can see in the result returned with reduction set to none, it doesn't seem to incorporate the class weight info.
So I wonder, is it a bug or some other problem?
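
To make clear what I am comparing against, here is a rough reference sketch in plain Python. It assumes the usual definition of a weighted cross-entropy, where each sample's loss is multiplied by the weight of its target class. The helper `per_sample_ce` and the logits, targets, and weights are made-up values for illustration, not my real data or any library's API.

```python
import math

def per_sample_ce(logits, targets, class_weights=None):
    """Per-sample cross-entropy with no reduction.

    If class_weights is given, each sample's loss is scaled by the
    weight of its target class (assumed definition, see text above).
    """
    losses = []
    for row, t in zip(logits, targets):
        # Numerically stable log-sum-exp over the class scores.
        m = max(row)
        log_sum_exp = m + math.log(sum(math.exp(x - m) for x in row))
        nll = log_sum_exp - row[t]  # negative log-probability of the target
        w = 1.0 if class_weights is None else class_weights[t]
        losses.append(w * nll)
    return losses

# Hypothetical toy inputs: 3 samples, 3 classes.
logits = [[2.0, 0.5, -1.0], [0.1, 1.2, 0.3], [-0.5, 0.0, 2.5]]
targets = [0, 1, 2]
class_weights = [1.0, 2.0, 0.5]

unweighted = per_sample_ce(logits, targets)
weighted = per_sample_ce(logits, targets, class_weights)

for t, u, w in zip(targets, unweighted, weighted):
    print(f"target={t} unweighted={u:.4f} weighted={w:.4f} ratio={w / u:.2f}")
```

If the weights are applied, the ratio for each sample should equal the weight of its target class. If the per-sample values from the library match the unweighted column instead, the weights are not being used for that output.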