Method for efficiently transferring non-autograd tensors to CPU from GPU?

@henryald Alternatively, you can use torch.profiler, which does the syncing for you 🙂

For an example, see post #3 by neoncube in the thread "Model() uses GPU but backwards() doesn't".
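In case it helps, here is a minimal sketch of what profiling a GPU-to-CPU copy with torch.profiler might look like. The helper name `profile_transfer` and the tensor size are made up for illustration and are not from the linked thread; the sketch falls back to CPU-only profiling when no CUDA device is available, in which case `.cpu()` is a no-op.

```python
import torch
from torch.profiler import profile, ProfilerActivity


def profile_transfer(numel: int = 1_000_000):
    """Hypothetical helper: profile moving a non-autograd tensor to the CPU."""
    use_cuda = torch.cuda.is_available()
    device = "cuda" if use_cuda else "cpu"

    # Only request CUDA activity when a GPU is actually present.
    activities = [ProfilerActivity.CPU]
    if use_cuda:
        activities.append(ProfilerActivity.CUDA)

    x = torch.randn(numel, device=device)

    # no_grad + detach: the tensor carries no autograd history.
    with torch.no_grad(), profile(activities=activities) as prof:
        y = x.detach().cpu()

    return y, prof


y, prof = profile_transfer()
# Summarize the recorded ops; on a GPU machine the device-to-host copy
# should appear among the rows.
print(prof.key_averages().table(sort_by="self_cpu_time_total", row_limit=10))
```

The point of going through the profiler rather than wrapping the copy in `time.time()` calls is that you don't have to place `torch.cuda.synchronize()` calls by hand to get meaningful timings.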