PyTorch: CUDA synchronize out of memory

These other sessions likely still hold cached blocks (we use a caching memory allocator), so memory they have freed stays reserved inside their processes rather than being returned to the driver, and it shows up as "used" to everyone else.
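
As a minimal sketch of that caching behavior (the tensor size here is arbitrary, chosen only to make the effect visible), you can watch `torch.cuda.memory_allocated()` drop after a deletion while `torch.cuda.memory_reserved()` stays high, until `torch.cuda.empty_cache()` hands the blocks back to the driver:

```python
import torch

# Allocate ~1 GiB of float32 and then free it.
x = torch.empty(1024, 1024, 256, device="cuda")
print(torch.cuda.memory_allocated(), torch.cuda.memory_reserved())

del x
# memory_allocated drops, but memory_reserved stays high: the freed
# block is cached inside this process and still looks "used" in nvidia-smi.
print(torch.cuda.memory_allocated(), torch.cuda.memory_reserved())

# Return cached, unused blocks to the driver so other processes can use them.
torch.cuda.empty_cache()
print(torch.cuda.memory_reserved())
```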

When you first make a CUDA call, it creates a CUDA context and a THC context on the primary GPU (GPU0), and I think that needs around 200 MB. That's right at the edge of how much memory you have left.
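
If GPU0 is the crowded one, a common workaround is to hide it before `torch` is imported so the context (and its fixed overhead) lands on the GPU you actually want. A minimal sketch, assuming you want to run on GPU 1:

```python
import os

# Must be set before the first CUDA call (safest: before importing torch).
# Assumption: GPU 1 is the device you intend to use.
os.environ["CUDA_VISIBLE_DEVICES"] = "1"

import torch

# The first CUDA call creates the context on what is now "cuda:0",
# which is the physical GPU 1.
x = torch.zeros(1, device="cuda")
```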