Could you post small tensors that reproduce these wrong results?
Here is a small example using different reduction settings with and without using ignore_index.
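A minimal sketch of such a comparison, assuming the loss in question is `F.cross_entropy` (the loss function is not named above). The shapes, the seed, and the ignored class index are made up for illustration:

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)

# Assumed setup: 4 samples, 3 classes (hypothetical, chosen to keep things small).
logits = torch.randn(4, 3)
target = torch.tensor([0, 2, 1, 2])
ignore = 2  # hypothetical class index to ignore

# Without ignore_index: all samples contribute.
per_sample = F.cross_entropy(logits, target, reduction="none")
loss_sum = F.cross_entropy(logits, target, reduction="sum")
loss_mean = F.cross_entropy(logits, target, reduction="mean")

# With ignore_index: samples whose target equals `ignore` are skipped.
per_sample_ign = F.cross_entropy(logits, target, ignore_index=ignore, reduction="none")
loss_sum_ign = F.cross_entropy(logits, target, ignore_index=ignore, reduction="sum")
loss_mean_ign = F.cross_entropy(logits, target, ignore_index=ignore, reduction="mean")

# Manual reference: mask out the ignored samples and reduce by hand.
keep = target != ignore
manual_sum = per_sample[keep].sum()
manual_mean = per_sample[keep].mean()  # averaged over non-ignored samples only

print("none:", per_sample, "| with ignore_index:", per_sample_ign)
print("sum: ", loss_sum, "| with ignore_index:", loss_sum_ign, "| manual:", manual_sum)
print("mean:", loss_mean, "| with ignore_index:", loss_mean_ign, "| manual:", manual_mean)
```

With `reduction="none"` the ignored positions should come back as zeros, and with `reduction="mean"` the sum is divided by the number of non-ignored targets rather than by the batch size. If your numbers disagree with the manual values on tensors this small, please post them.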