Getting a network to train with CUDA

Hello,

I have defined a U-net as well as training code, but I don’t think it is training with CUDA (i.e. using the GPU). In my limited experience with TensorFlow, when training started with CUDA it would print a message stating that CUDA was running, which devices had been identified, etc. Is there something similar for PyTorch? I am just going off the fact that GPU utilization is very low.
[screenshot: Task Manager showing very low GPU utilization]
I have included a few lines in my code that I thought would enable CUDA use, but I don’t believe it has worked, i.e.:

self.device = torch.device("cuda:0")
torch.set_default_tensor_type("torch.cuda.FloatTensor")

Is there anything else for me to keep in mind? I briefly reviewed CUDA semantics — PyTorch 2.1 documentation, but I am a bit ‘lost in the documentation’.
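
For completeness, this is the rough pattern I think I’m supposed to follow (a simplified sketch with placeholder names like MyUNet and train_loader, not my actual code), so please point out if a step is missing:

import torch
import torch.nn as nn

device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
print(device)  # expect "cuda:0" if PyTorch can see the GPU

model = MyUNet().to(device)            # placeholder for my U-net; moves its parameters to the GPU
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
criterion = nn.BCEWithLogitsLoss()

for images, masks in train_loader:     # placeholder DataLoader
    images = images.to(device)         # the input batches have to be moved to the GPU too
    masks = masks.to(device)
    optimizer.zero_grad()
    loss = criterion(model(images), masks)
    loss.backward()
    optimizer.step()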

Thanks,

kyle

Do you have an NVIDIA GPU? The above screenshot shows only the Intel integrated GPU.

Sorry, my bad

[screenshot: Task Manager showing the NVIDIA GPU]
This screenshot shows the NVIDIA GPU. I have made CUDA work in TensorFlow before, so I know my CUDA setup is correct; I assume the issue is related to my code and the way I’m calling it.

Kyle

Hi,

I think there is a known issue where the Windows Task Manager does not report CUDA usage of the GPU, so it is expected to stay close to 0% there.
You can use CLI tools like nvidia-smi to see what is actually happening.
Or check the temperature: 71°C doesn’t look like an idle card :smiley:
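You can also check from inside Python; a minimal sketch, assuming you run it in the same environment as your training script:

import torch

print(torch.cuda.is_available())       # True if PyTorch can see a CUDA device
print(torch.cuda.get_device_name(0))   # name of the first CUDA device
print(torch.cuda.memory_allocated(0))  # bytes allocated on that GPU; > 0 once tensors live there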


Training is also extremely slow, which leads me to believe CUDA actually isn’t being used.
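Is there a quick check for that? I assume something like this (with model being my U-net instance) would at least tell me where the parameters ended up:

print(next(model.parameters()).device)  # should print cuda:0 if the model was moved to the GPU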

Hey, I don’t know how Windows calculates the overall GPU utilization, but you can switch one of the graphs in Task Manager to show the CUDA utilization.


Can you run nvidia-smi?


I switched to my Linux machine and can now check with nvidia-smi!