Allocate all memory and reuse cached memory

Please avoid posting the same message multiple times…
I already answered this in the other post that asked the same question here.

The short answer is that PyTorch is not built to handle sharing a GPU across processes, but to use it as efficiently as possible. What you are doing here is a hack and will have side effects, like making your program OOM in cases where it otherwise would not.
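
For reference, a minimal sketch of the kind of pre-allocation hack being discussed, assuming the idea is to grab a large block up front so the caching allocator keeps it cached for later reuse (the size and variable names below are made up for illustration):

```python
import torch

# Hypothetical pre-allocation hack: allocate a big chunk of GPU memory up front
# so PyTorch's caching allocator reserves it, then drop the tensor so the
# cached block can be reused by later allocations.
reserve_bytes = 4 * 1024**3  # e.g. 4 GiB; made-up figure for illustration
placeholder = torch.empty(reserve_bytes, dtype=torch.uint8, device="cuda")
del placeholder  # the memory stays cached by the allocator, not returned to the driver

# Later allocations are served from the cached block instead of a fresh cudaMalloc,
# but that memory remains unavailable to other processes on the same GPU, and the
# program can now OOM in situations where it otherwise would not.
x = torch.randn(1024, 1024, device="cuda")
```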
If you want more details on why we can’t fix the memory fragmentation problem in PyTorch, you can check this thread.
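
If it helps, you can see how much memory the caching allocator is holding versus how much is actually in use by live tensors; a large gap between the two is where this kind of caching and fragmentation behavior shows up. This is just a small illustration using the standard memory-stats APIs, not anything specific to that thread:

```python
import torch

# Memory used by live tensors vs. memory reserved (cached) by the allocator.
allocated = torch.cuda.memory_allocated()
reserved = torch.cuda.memory_reserved()
print(f"allocated: {allocated / 1024**2:.1f} MiB, reserved: {reserved / 1024**2:.1f} MiB")

# empty_cache() releases unused cached blocks back to the driver, but it cannot
# move or compact blocks that are still in use, so fragmentation among live
# allocations cannot simply be cleaned up from the Python side.
torch.cuda.empty_cache()
```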