I’ve been trying to measure batch processing time, but ran into the asynchronous execution of torch CUDA operations. I know I can force a sync with the forward pass by calling loss.item(), for example, but that looks ugly.
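A minimal sketch of what I'm doing now (the model and shapes are just placeholders); the `.item<float>()` call is only there to force the GPU work to finish before I stop the timer:

```cpp
#include <torch/torch.h>
#include <chrono>
#include <iostream>

int main() {
  torch::Device device(torch::kCUDA);
  auto model = torch::nn::Linear(1024, 10);
  model->to(device);
  auto input = torch::randn({64, 1024}, device);

  auto start = std::chrono::steady_clock::now();
  auto output = model->forward(input);  // kernels are only enqueued here
  // Ugly workaround: reading a scalar back to the CPU blocks until
  // all queued CUDA work for this tensor has completed.
  auto loss = output.sum().item<float>();
  auto end = std::chrono::steady_clock::now();

  std::cout << "batch time: "
            << std::chrono::duration<double, std::milli>(end - start).count()
            << " ms\n";
}
```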
Is there an analog of torch.cuda.synchronize() in the C++ API?