non_blocking=True reserves extra memory on GPU 0

ManiadisG · September 12, 2021, 6:26pm

I am running experiments using DDP and I observed that, when I set non_blocking=True to move the batch to the GPUs, memory is reserved on GPU 0 even when GPU 0 is not used by the experiment (its utilization stays at 0%). With non_blocking=False, this does not happen.

Is this normal? What might be causing it?

ptrblck · September 13, 2021, 2:55am

I’m unable to reproduce the additional memory usage of GPU0 using the DDP example and adding non_blocking=True to the to() operation, so could you post a minimal, executable code snippet, please?
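
For reference, a minimal sketch of the kind of snippet that could serve as a starting point for a reproduction. It is not the original poster's code: the helper names `pick_device`, `move_batch`, and `report_memory` are hypothetical, and the sketch assumes each DDP process binds itself to its own GPU with `torch.cuda.set_device(rank)` before doing any CUDA work. It falls back to the CPU when CUDA is not available.

```python
import torch


def pick_device(rank: int) -> torch.device:
    """Return the device for this process (hypothetical helper)."""
    if torch.cuda.is_available():
        # Assumption: bind the process to its own GPU before any CUDA work,
        # so later CUDA calls do not default to device 0.
        torch.cuda.set_device(rank)
        return torch.device("cuda", rank)
    return torch.device("cpu")


def move_batch(batch: torch.Tensor, device: torch.device) -> torch.Tensor:
    """Copy a batch to `device` with non_blocking=True."""
    if device.type == "cuda":
        # A host-to-device copy can only be asynchronous when the source
        # tensor lives in pinned (page-locked) host memory.
        batch = batch.pin_memory()
    return batch.to(device, non_blocking=True)


def report_memory() -> dict:
    """Bytes reserved by this process's caching allocator on each visible GPU.

    This does not include the CUDA context itself, which only shows up in
    tools such as nvidia-smi.
    """
    return {i: torch.cuda.memory_reserved(i) for i in range(torch.cuda.device_count())}
```

Calling `report_memory()` in each process after `move_batch(...)`, once with `non_blocking=True` and once with `non_blocking=False`, and comparing the result with the per-process listing in `nvidia-smi`, may help narrow down which process touches GPU 0.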