Understanding PyTorch Profiler for GPU


I want to understand how the PyTorch profiler relates CPU-bound tasks to GPU-bound tasks in a trace. I understand that timestamps are used for this purpose, but how are the timestamps generated for the GPU-bound tasks? I can see that NVIDIA's CUPTI is used, but I would like to know the exact APIs used to generate the timestamps and to correlate the two. If anyone can point me in the right direction, that would be really helpful!
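To make the question concrete, here is a minimal sketch of the kind of pairing I mean. It assumes the exported Chrome trace tags both the CPU-side launch call and the GPU kernel with a shared `correlation` value under `args`. The sample events, the `hypothetical_gemm_kernel` name, and the `pair_launches_with_kernels` helper are hand-written illustrations, not real profiler output or a profiler API.

```python
import json
from collections import defaultdict

# Hand-written stand-in for an exported Chrome trace. The field names
# ("cat", "ts", "args" -> "correlation") are assumptions about the layout.
SAMPLE_TRACE = json.dumps({
    "traceEvents": [
        {"ph": "X", "cat": "cpu_op", "name": "aten::mm",
         "ts": 90, "dur": 20, "args": {}},
        {"ph": "X", "cat": "cuda_runtime", "name": "cudaLaunchKernel",
         "ts": 100, "dur": 5, "args": {"correlation": 7}},
        {"ph": "X", "cat": "kernel", "name": "hypothetical_gemm_kernel",
         "ts": 140, "dur": 30, "args": {"correlation": 7}},
    ]
})


def pair_launches_with_kernels(trace_json):
    """Join CPU-side launch events to GPU kernel events by correlation value.

    Returns (correlation, cpu_name, gpu_name, gpu_ts - cpu_ts) tuples,
    sorted by correlation value.
    """
    events = json.loads(trace_json)["traceEvents"]
    by_corr = defaultdict(dict)
    for ev in events:
        corr = ev.get("args", {}).get("correlation")
        if corr is None:
            continue  # event carries no correlation value, e.g. a plain CPU op
        side = "gpu" if ev.get("cat") == "kernel" else "cpu"
        by_corr[corr][side] = ev

    pairs = []
    for corr, sides in sorted(by_corr.items()):
        if "cpu" in sides and "gpu" in sides:
            delay = sides["gpu"]["ts"] - sides["cpu"]["ts"]
            pairs.append((corr, sides["cpu"]["name"], sides["gpu"]["name"], delay))
    return pairs


if __name__ == "__main__":
    for corr, cpu_name, gpu_name, delay in pair_launches_with_kernels(SAMPLE_TRACE):
        print(f"correlation={corr}: {cpu_name} -> {gpu_name} (start offset {delay})")
```

What I cannot tell from a trace like this is where the GPU-side `ts` values and the shared correlation value come from inside CUPTI, which is the part I am asking about.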