I am currently learning about the PyTorch profiler and its TensorBoard integration. I am following this tutorial: PyTorch Profiler With TensorBoard — PyTorch Tutorials 2.1.0+cu121 documentation.
I tried it both on Google Colab and on my local machine, which has an RTX 2080. When I run the exact tutorial code on Colab, I get the expected report with GPU activity…
Yet when I run it on my machine (after calling .cuda(); I checked that my models and data were on CUDA, and according to PyTorch they are), I obtain the following result:
This seems to indicate that all operations were performed on the CPU (and that the device is CPU), even though my device is cuda.
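For reference, this is roughly how I verified the placement (the toy model and tensor here are just placeholders, not the tutorial's actual model and data):

```python
import torch

# Placeholder model and batch, standing in for the tutorial's model/data
model = torch.nn.Linear(4, 2)
x = torch.randn(8, 4)

if torch.cuda.is_available():
    model = model.cuda()
    x = x.cuda()

# On my machine, PyTorch reports these as being on the cuda device
print(next(model.parameters()).device)
print(x.device)
```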
I believe this might be because I did not install cudatoolkit on my machine (although I did install torch via pip), but can this line:
device = "cuda" if torch.cuda.is_available() else "cpu"
return "cuda" if cudatoolkit is not installed on my machine? Is it possible to put models and data on CUDA and still perform operations on the CPU?
And if that is the case and my operations really are running on the CPU: I noticed that it is still much faster with the model and data on CUDA than on the CPU. How can that be faster?
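One way to see which build of torch is installed (the version strings in the comments are examples from my setup, not guaranteed outputs):

```python
import torch

# A CPU-only pip wheel ships without CUDA support, so is_available()
# returns False even on a machine with an NVIDIA GPU. A "+cuXXX" wheel
# bundles its own CUDA runtime, so no separate cudatoolkit install is
# needed for it to return True (a working NVIDIA driver is still required).
print(torch.__version__)      # e.g. "2.1.0+cu121" for a CUDA build, "2.1.0+cpu" otherwise
print(torch.version.cuda)     # None on a CPU-only build
print(torch.cuda.is_available())

device = "cuda" if torch.cuda.is_available() else "cpu"
print(device)
```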
Thank you for your time and help