Ok, gotcha, well, thank you very much for your help!
Glad that this is solved! (And not a PyTorch bug :))
For people who might run into the same thing as me: the GPU monitor in the Windows Task Manager has a drop-down button on each graph where you can select "Cuda" to show the actual GPU load while training.
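If you'd rather cross-check the graph from a script, here is a minimal sketch that reads the same kind of numbers from `nvidia-smi`. It assumes an NVIDIA driver with `nvidia-smi` on the PATH and that both fields come back as plain integers (some GPUs report `[N/A]`, which this does not handle). The helper names `parse_gpu_stats` and `query_gpu_stats` are made up for this example.

```python
import shutil
import subprocess


def parse_gpu_stats(csv_text):
    """Parse 'utilization.gpu, memory.used' CSV rows (one row per GPU)."""
    stats = []
    for line in csv_text.strip().splitlines():
        util, mem = (field.strip() for field in line.split(","))
        stats.append({"util_percent": int(util), "mem_used_mib": int(mem)})
    return stats


def query_gpu_stats():
    """Return per-GPU stats, or None if nvidia-smi is not installed."""
    if shutil.which("nvidia-smi") is None:
        return None
    result = subprocess.run(
        [
            "nvidia-smi",
            "--query-gpu=utilization.gpu,memory.used",
            "--format=csv,noheader,nounits",
        ],
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_gpu_stats(result.stdout)
```

Calling `query_gpu_stats()` in a second terminal while training is running should show a non-zero `util_percent` if the model is really on the GPU.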