Getting a network to train with CUDA

Hello,

I have defined a U-net as well as training code, but I don’t think it is training with CUDA (i.e. using the GPU). In my limited experience with TensorFlow, when training started with CUDA it would print a message stating that CUDA was running, which devices had been identified, etc. Is there something similar for PyTorch? I am just going off the fact that GPU utilization is very low.
[screenshot: Task Manager showing very low GPU utilization]
I have included a few lines in my code that I thought would enable CUDA use, but I don’t believe it has worked, i.e.:

self.device = torch.device("cuda:0")
torch.set_default_tensor_type("torch.cuda.FloatTensor")

Is there anything else for me to keep in mind? I briefly reviewed CUDA semantics — PyTorch 2.1 documentation, but I am a bit ‘lost in the documentation’.
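
For completeness, this is the rough pattern I think I’m supposed to follow (a simplified sketch with placeholder names like MyUNet and train_loader, not my actual code), so please point out if a step is missing:

import torch
import torch.nn as nn

device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
print(device)  # expect "cuda:0" if PyTorch can see the GPU

model = MyUNet().to(device)            # placeholder for my U-net; moves its parameters to the GPU
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
criterion = nn.BCEWithLogitsLoss()

for images, masks in train_loader:     # placeholder DataLoader
    images = images.to(device)         # the input batches have to be moved to the GPU too
    masks = masks.to(device)
    optimizer.zero_grad()
    loss = criterion(model(images), masks)
    loss.backward()
    optimizer.step()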

Thanks,

kyle

Do you have an NVIDIA GPU? The above screenshot shows only the Intel integrated GPU.

Sorry, my bad

[screenshot: Task Manager showing the NVIDIA GPU]
This screenshot shows the NVIDIA GPU. I have made CUDA work in TensorFlow before, so I know my CUDA setup is correct; I assume the issue is related to my code and the way I’m calling it.

Kyle

Hi,

I think there is a known issue where the Windows Task Manager does not report CUDA usage of the GPU, so it is expected to stay close to 0% there.
You can use CLI tools like nvidia-smi to see what is actually happening.
Or check the temperature: 71°C doesn’t look like an idle card :smiley:
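You can also check from inside Python; a minimal sketch, assuming you run it in the same environment as your training script:

import torch

print(torch.cuda.is_available())       # True if PyTorch can see a CUDA device
print(torch.cuda.get_device_name(0))   # name of the first CUDA device
print(torch.cuda.memory_allocated(0))  # bytes allocated on that GPU; > 0 once tensors live there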


Training is also extremely slow, which leads me to believe CUDA actually isn’t being used.
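Is there a quick check for that? I assume something like this (with model being my U-net instance) would at least tell me where the parameters ended up:

print(next(model.parameters()).device)  # should print cuda:0 if the model was moved to the GPU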

Hey, I don’t know how Windows calculates the overall GPU utilization, but you can switch one of the graphs in Task Manager to show the CUDA utilization.


Can you run nvidia-smi?


I switched to my Linux machine and can now check with nvidia-smi!