I am training a classifier on a remote server using the official example code from the PyTorch website. The device is set to cuda, and I have verified that the GPU is being used during training. However, CPU usage jumps to more than 300% when I monitor it with htop. I have even tried feeding the model random tensors, and the same thing happens, which rules out the DataLoader as the bottleneck. Any ideas for keeping CPU usage low would be appreciated. I installed PyTorch with pip (conda is not allowed on the server), and the version is the latest (1.12).
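For reference, here is a minimal sketch of the random-tensor test I describe, plus the thread-capping calls I have been experimenting with. The model and shapes are made up for illustration; `torch.set_num_threads` / `torch.set_num_interop_threads` limit PyTorch's CPU thread pools, which (assumption on my part) may be what htop is reporting rather than the DataLoader:

```python
import torch

# Cap intra-op and inter-op CPU parallelism. These must be set early,
# before any parallel CPU work has run.
torch.set_num_threads(1)
torch.set_num_interop_threads(1)

# Hypothetical tiny model; the real code is a classifier from the
# official PyTorch tutorial.
device = "cuda" if torch.cuda.is_available() else "cpu"
model = torch.nn.Linear(1024, 10).to(device)
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
criterion = torch.nn.CrossEntropyLoss()

for _ in range(10):
    # Random tensors generated directly on the device, so no
    # DataLoader is involved at all.
    x = torch.randn(64, 1024, device=device)
    y = torch.randint(0, 10, (64,), device=device)
    optimizer.zero_grad()
    loss = criterion(model(x), y)
    loss.backward()
    optimizer.step()
```

Even with this stripped-down loop (no data loading, no disk I/O), the CPU usage spikes the same way.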