Is it faster to do inference on 1 image on a CPU or GPU?

I wonder whether transferring the model and data to the GPU by calling .cuda() adds any significant overhead.

For reference, I’m using a resnet-18, so I also wonder whether, for such a small model, running on the CPU has a significant advantage over the GPU. Thanks.

It likely depends on your CPU and GPU. In my experience, there is a speedup to be had with the GPU even for resnet18 at batch size 1, but again, I am sure there are combinations of GPU and CPU where it comes out differently.
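Since the answer varies by hardware, the most reliable approach is to measure it on your own machine. Below is a minimal sketch of how that could look, not a definitive benchmark. The helper names `time_call` and `benchmark_resnet18` are made up for illustration, the 224x224 input size is an assumption, and the model uses random weights, which should not affect timing. It times the CPU, the GPU with the input already on the device, and the GPU including the per-image `.cuda()` copy, so the transfer overhead shows up separately.

```python
import time
from statistics import median


def time_call(fn, sync=None, warmup=5, iters=20):
    """Return the median wall-clock seconds per call of fn() (hypothetical helper)."""
    sync = sync or (lambda: None)
    # Warm-up runs absorb one-time costs such as CUDA initialization.
    for _ in range(warmup):
        fn()
    sync()
    samples = []
    for _ in range(iters):
        start = time.perf_counter()
        fn()
        sync()  # CUDA calls are asynchronous, so wait before stopping the clock
        samples.append(time.perf_counter() - start)
    return median(samples)


def benchmark_resnet18():
    """Time batch-size-1 inference on the CPU and, if available, the GPU."""
    import torch
    import torchvision

    model = torchvision.models.resnet18().eval()  # random weights, no download
    image = torch.randn(1, 3, 224, 224)  # assumed input size
    results = {}
    with torch.no_grad():
        results["cpu"] = time_call(lambda: model(image))
        if torch.cuda.is_available():
            gpu_model = model.cuda()  # one-time cost, not included in the timings
            gpu_image = image.cuda()
            results["gpu, input already on device"] = time_call(
                lambda: gpu_model(gpu_image), sync=torch.cuda.synchronize
            )
            results["gpu, including .cuda() copy of the input"] = time_call(
                lambda: gpu_model(image.cuda()), sync=torch.cuda.synchronize
            )
    return results
```

Calling `benchmark_resnet18()` returns a dict mapping each setup to its median seconds per image. Moving the model to the GPU happens once, so it only matters if you load the model anew for every image; the input copy is the part you pay each time.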

Best regards

Thomas