[Reporting bug] INTERNAL ASSERT FAILED at "C:/w/b/windows/pytorch/aten/src\\ATen/native/cuda/Reduce.cuh":929, please report a bug to PyTorch

Thanks for the reply,
I'd really like to, but my code is quite complicated (or messy) at this point, so I'm not sure I can put together a minimal example.

What I can say is that this error alternates with another memory-related error, depending on the configuration:

A (normal case): normal model size, batch size = 2 per GPU (uses < 50% of total GPU memory)
B (CUDA out of memory): normal model size, batch size = 3 or 2 per GPU (uses well under 70% of total GPU memory); see: [CUDA out of memory] How to reserve memory in GPU?
C (this case): larger model size, batch size = 1 per GPU (uses < 50% of total GPU memory)
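
For context, the usage percentages above are rough readings from the PyTorch caching allocator, along the lines of this sketch (the `report_gpu_memory` helper is just an illustrative name, not code from my actual training script):

```python
import torch

def report_gpu_memory(device: int = 0) -> None:
    # Hypothetical helper: prints how much of one GPU's capacity is
    # currently allocated/reserved by the PyTorch caching allocator.
    torch.cuda.synchronize(device)
    total = torch.cuda.get_device_properties(device).total_memory
    allocated = torch.cuda.memory_allocated(device)
    reserved = torch.cuda.memory_reserved(device)
    print(f"GPU {device}: allocated {allocated / total:.0%}, "
          f"reserved {reserved / total:.0%} of {total / 2**30:.1f} GiB")

# Called once per GPU after a forward/backward step:
for d in range(torch.cuda.device_count()):
    report_gpu_memory(d)
```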

So I suspect this is related to some memory issue, but I have no clue what is actually going wrong.

Thanks,