Very low GPU power usage

You could check whether your GPUs are lowering their clocks to avoid overheating (thermal throttling). If they are, take a look at the cooling solution and improve it.
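A quick way to check for throttling is to ask `nvidia-smi` for the clocks, power draw and thermal slowdown flags. The snippet below is only a rough sketch: the helper names (`parse_report`, `thermally_throttled`, `query_gpus`, `main`) are made up for illustration, and the query field names are the ones I know from `nvidia-smi --help-query-gpu`; newer drivers may list the throttle fields under `clocks_event_reasons.*` instead, so please verify them on your system.

```python
import csv
import io
import subprocess

# Fields passed to `nvidia-smi --query-gpu`. Assumption: these names are
# accepted by your driver; check `nvidia-smi --help-query-gpu` if not.
FIELDS = [
    "index",
    "temperature.gpu",
    "power.draw",
    "power.limit",
    "clocks.sm",
    "clocks.max.sm",
    "clocks_throttle_reasons.hw_thermal_slowdown",
    "clocks_throttle_reasons.sw_thermal_slowdown",
]
THERMAL_FLAGS = FIELDS[-2:]


def parse_report(text):
    """Turn `--format=csv,noheader,nounits` output into one dict per GPU."""
    rows = []
    for record in csv.reader(io.StringIO(text), skipinitialspace=True):
        if len(record) != len(FIELDS):
            continue  # skip blank or unexpected lines
        rows.append(dict(zip(FIELDS, (value.strip() for value in record))))
    return rows


def thermally_throttled(row):
    """True if nvidia-smi reports a hardware or software thermal slowdown."""
    return any(row[flag] == "Active" for flag in THERMAL_FLAGS)


def query_gpus():
    """Run nvidia-smi once and return the parsed rows (needs an NVIDIA driver)."""
    cmd = [
        "nvidia-smi",
        "--query-gpu=" + ",".join(FIELDS),
        "--format=csv,noheader,nounits",
    ]
    out = subprocess.run(cmd, capture_output=True, text=True, check=True).stdout
    return parse_report(out)


def main():
    """Print one summary line per GPU."""
    for gpu in query_gpus():
        state = "THERMAL THROTTLING" if thermally_throttled(gpu) else "ok"
        print(
            f"GPU {gpu['index']}: {gpu['temperature.gpu']} C, "
            f"{gpu['power.draw']} / {gpu['power.limit']} W, "
            f"SM {gpu['clocks.sm']} / {gpu['clocks.max.sm']} MHz -> {state}"
        )
```

Call `main()` a few times while the job is running. If neither flag is active while the power draw stays well below the limit, thermal throttling is probably not the cause.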
Also, you could profile your workload (e.g. with Nsight Systems) to see whether the GPUs are bottlenecked by other parts of the code and spend their time waiting (e.g. inside an NCCL call).
