I think you are experiencing what is called mode collapse. There is an amazing paper by the founder of PyTorch on mitigating this problem.