Wait for asynchronous layer to execute on GPU

Good morning,

I am using a GPU implementation of farthest point sampling, and even though I know it takes a long time to execute, I would like to get the exact execution time to assess how to improve it. But since GPU operations are asynchronous, I don’t really know how to measure the execution time for this particular part of the code: the blocking will only happen later, when the tensor is used again.
How can I wait for the execution to finish and display the execution time right after the GPU function is called?
Justin

Oops, I just found the solution in the documentation, in a section I hadn’t seen before:
https://pytorch.org/docs/stable/cuda.html
For anybody who wants to wait for the GPU code to finish executing, you can use:

torch.cuda.synchronize()

which waits for all kernels in all streams on the current CUDA device to complete.
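
In case it helps anyone timing things this way, here is a minimal sketch: call synchronize once before starting the timer (so previously queued work is not counted) and once again before reading it. The matmul below is just a placeholder standing in for the actual farthest point sampling call.

import time
import torch

x = torch.rand(4096, 4096, device="cuda")

torch.cuda.synchronize()   # wait for any previously queued GPU work before starting the clock
start = time.time()
y = x @ x                  # placeholder GPU op; replace with your farthest point sampling call
torch.cuda.synchronize()   # block until the kernel(s) launched above have finished
print(f"elapsed: {time.time() - start:.4f} s")

Without the second synchronize, time.time() would only measure how long it took to launch the kernel, not to run it.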