Isn’t that the same as model = model.cuda()? I printed the device of every tensor and they are all on cuda:0, but the GPU usage was pretty low and the speed didn’t improve much compared to running on the CPU.
Maybe try running watch nvidia-smi in one terminal, then run your code in another terminal to see the usage in real time. Also, I think there is always a maximum speedup you can achieve. Try increasing the batch size to the largest value that fits on your GPU.
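If you would rather log the numbers than watch them, here is a rough sketch that polls nvidia-smi from Python. The helper names (parse_gpu_csv, query_gpus) are made up for this example, and it assumes nvidia-smi is on your PATH and reports plain integers for every field (some GPUs print [N/A] instead, which this sketch does not handle).

```python
import subprocess

# Ask nvidia-smi for utilization and memory as bare CSV numbers, one line per GPU.
QUERY = [
    "nvidia-smi",
    "--query-gpu=utilization.gpu,memory.used,memory.total",
    "--format=csv,noheader,nounits",
]


def parse_gpu_csv(text):
    """Turn lines like '35, 2048, 8192' into a list of dicts (hypothetical helper)."""
    stats = []
    for line in text.strip().splitlines():
        # Assumption: every field is an integer; '[N/A]' would raise ValueError here.
        util, used, total = (int(field.strip()) for field in line.split(","))
        stats.append(
            {"util_pct": util, "mem_used_mib": used, "mem_total_mib": total}
        )
    return stats


def query_gpus():
    """Run nvidia-smi once and return the parsed stats for each GPU."""
    result = subprocess.run(QUERY, capture_output=True, text=True, check=True)
    return parse_gpu_csv(result.stdout)
```

Calling query_gpus() every second or so while training runs in another process gives you roughly the same picture as watch nvidia-smi, but in a form you can write to a file.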
The Windows task manager doesn’t show the “compute” tab by default, so either add it or use nvidia-smi.
Since the GPU Util is >0%, the GPU is used.
You might suffer from other bottlenecks, e.g. data loading, so make sure to store the data on a local SSD.
If that doesn’t help, you would have to profile the code and check which parts use most of the runtime.
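As a first rough check before reaching for a real profiler, you could split the time spent waiting for data from the time spent in the training step. This is only a sketch: time_loop, batches, step, and sync are names invented for this example, not part of any library.

```python
import time


def time_loop(batches, step, sync=None, max_batches=50):
    """Measure time spent fetching batches vs. running step (hypothetical helper).

    batches: any iterable yielding batches (e.g. a data loader).
    step:    a function that processes one batch.
    sync:    optional callable that blocks until queued GPU work is done;
             without it, asynchronous GPU kernels make step look too fast.
    """
    load_s = 0.0
    step_s = 0.0
    count = 0
    iterator = iter(batches)
    while count < max_batches:
        t0 = time.perf_counter()
        try:
            batch = next(iterator)  # time spent waiting for the next batch
        except StopIteration:
            break
        t1 = time.perf_counter()
        step(batch)  # forward/backward/optimizer work
        if sync is not None:
            sync()  # wait for the GPU so the timing is meaningful
        t2 = time.perf_counter()
        load_s += t1 - t0
        step_s += t2 - t1
        count += 1
    return {"batches": count, "load_s": load_s, "step_s": step_s}
```

With PyTorch you would call it along the lines of time_loop(train_loader, train_step, sync=torch.cuda.synchronize), where train_loader and train_step stand in for your own loader and step function. If load_s is much larger than step_s, the GPU is mostly waiting on data loading, which would explain the low utilization.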
Thank you! I guess I was expecting it to be super fast on the GPU, because I heard there should be a huge speedup going from CPU to GPU. But even after I improved my CNN’s layers and increased the batch size, the GPU usage was still pretty low and the project wasn’t really fast.