A question about running time and CUDA

I used net.cuda() and Variable.cuda() in my code, and no other .cuda() calls appear anywhere. I fine-tuned ResNet-50 on CIFAR-10, changing only the strides of the net to fit the 32*32 images. The batch size is 128 and the GPU is an NVIDIA TITAN X. I found that it only did 4 iterations in 10 minutes. Is that too slow? And are the .cuda() calls I used enough?

The .cuda() calls you used are enough, but it is better to do:

Variable(x.cuda()), where x is the input data, rather than Variable(x).cuda()
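As a rough sketch of that pattern, the snippet below moves the raw tensor to the GPU first and wraps it afterwards, then checks where the model and the inputs actually live. The small convolution layer is only a hypothetical stand-in for the ResNet-50, the batch is random data rather than CIFAR-10, and the to_variable helper is a name made up for this example. It falls back to the CPU when no GPU is visible so that it still runs.

```python
import torch
import torch.nn as nn
from torch.autograd import Variable

# Assumption: fall back to CPU when CUDA is unavailable so the sketch still runs.
use_cuda = torch.cuda.is_available()

# Hypothetical stand-in for the fine-tuned ResNet-50.
net = nn.Conv2d(3, 16, kernel_size=3, padding=1)
if use_cuda:
    net = net.cuda()


def to_variable(x):
    """Move the raw tensor to the GPU first, then wrap it in a Variable."""
    if use_cuda:
        x = x.cuda()
    return Variable(x)


# One batch of 128 random 32x32 RGB images (not real CIFAR-10 data).
batch = torch.randn(128, 3, 32, 32)
inputs = to_variable(batch)
outputs = net(inputs)

# On a working GPU setup both values should be True.
print(next(net.parameters()).is_cuda, inputs.is_cuda)
```

If either value prints False on a machine that has a GPU, the corresponding .cuda() call is missing or CUDA is not visible to PyTorch.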

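To check whether the GPU is actually being used, run the command nvidia-smi while training is in progress and look at the output: your Python process should be listed, with non-zero GPU utilization and memory usage.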
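If you would rather check this from a script than read the nvidia-smi table by eye, something like the sketch below may help. It relies on nvidia-smi's CSV query mode (--query-gpu with --format=csv,noheader,nounits). The function names and dictionary keys are made up for this example, and the parser assumes every field is a plain integer, which will not hold on GPUs that report a field as not available.

```python
import subprocess

# nvidia-smi CSV query: one line per GPU, values in %, MiB, MiB.
QUERY = [
    "nvidia-smi",
    "--query-gpu=utilization.gpu,memory.used,memory.total",
    "--format=csv,noheader,nounits",
]


def parse_gpu_stats(csv_text):
    """Parse lines of the form 'util, mem_used, mem_total' into dicts."""
    stats = []
    for line in csv_text.strip().splitlines():
        # Assumption: all three fields are integers (no '[N/A]' values).
        util, used, total = (int(field) for field in line.split(","))
        stats.append({
            "util_percent": util,
            "mem_used_mib": used,
            "mem_total_mib": total,
        })
    return stats


def query_gpu_stats():
    """Run nvidia-smi and return the parsed stats, one dict per GPU."""
    output = subprocess.check_output(QUERY).decode("utf-8")
    return parse_gpu_stats(output)
```

Calling query_gpu_stats() from a second terminal while training runs gives one dictionary per GPU. Utilization that stays near zero suggests either that the model is not on the GPU or that the training loop is waiting on something else, such as data loading.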