Does the PyTorch profiler serialize computations on CUDA?

This might be related to 176120 as well.
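
One way to check this might be a small experiment like the sketch below: queue independent matmuls on two CUDA streams, time them once without the profiler and once under `torch.profiler.profile`, and export a trace. The function names, matrix size, iteration count, and trace filename are all made up for illustration, and the `ImportError` guard is only there so the snippet imports cleanly on a machine without PyTorch. Treat it as a rough probe, not a definitive test.

```python
import time

try:
    import torch
    from torch.profiler import ProfilerActivity, profile
except ImportError:  # assumption: allow importing this sketch without PyTorch
    torch = None


def two_stream_workload(n=1024, iters=50):
    """Hypothetical workload: independent matmuls queued on two CUDA streams."""
    streams = [torch.cuda.Stream(), torch.cuda.Stream()]
    mats = [torch.randn(n, n, device="cuda") for _ in streams]
    torch.cuda.synchronize()  # make sure the inputs are ready before timing
    start = time.perf_counter()
    for stream, mat in zip(streams, mats):
        with torch.cuda.stream(stream):
            for _ in range(iters):
                _ = mat @ mat  # result discarded; only the kernel launch matters
    torch.cuda.synchronize()  # wait for both streams to finish
    return time.perf_counter() - start


def compare_with_and_without_profiler(trace_path="two_stream_trace.json"):
    """Return wall-clock times for both runs, or None if CUDA is unavailable."""
    if torch is None or not torch.cuda.is_available():
        return None
    two_stream_workload()  # warm-up run (CUDA context, cuBLAS handles)
    baseline = two_stream_workload()
    with profile(activities=[ProfilerActivity.CPU, ProfilerActivity.CUDA]) as prof:
        profiled = two_stream_workload()
    prof.export_chrome_trace(trace_path)  # inspect per-stream kernel overlap here
    return {"baseline_s": baseline, "profiled_s": profiled}


if __name__ == "__main__":
    print(compare_with_and_without_profiler())
```

A slower profiled run would not prove serialization on its own, since the profiler adds overhead of its own, and whether the two streams overlap at all depends on the GPU and the kernel sizes. The exported trace is probably the more reliable evidence: if kernels on the two streams never overlap in the timeline under the profiler, that would point toward serialization.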