Curious about why variable.cuda() is slower than gradient descent

Hi,

I am profiling my training code to find the performance bottleneck. I found that the variable.cuda() operation takes much more time than the actual gradient descent (74.1% vs. 13.6%).

Is there any specific reason for this?

Thanks

Does anyone know?

Did you synchronize?

No, I didn’t use any synchronization.

I mean, you should call torch.cuda.synchronize() to get the “true” time. CUDA operations are launched asynchronously, so without explicit synchronization the time spent waiting for queued GPU work gets attributed to the next call that blocks, which in your profile is variable.cuda().
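
For example, a minimal timing sketch (assuming a CUDA-capable machine; the tensor size here is arbitrary):

```python
import time
import torch

# CUDA kernels launch asynchronously, so a plain host-side timer can
# attribute queued GPU work to whichever later call happens to block,
# e.g. a .cuda() copy. Synchronizing before and after isolates the op.

x = torch.randn(4096, 4096)

torch.cuda.synchronize()   # drain any pending GPU work before timing
start = time.time()
x_gpu = x.cuda()           # host-to-device copy
torch.cuda.synchronize()   # wait until the copy has actually finished
print(f".cuda() took {time.time() - start:.4f}s")
```

If you time your training step the same way, the gradient descent numbers should come out much closer to their real cost.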

oh, got it. Thank you!