Execution Time Measurement on the GPU

I would like to measure the execution time of my code when it runs on the GPU with PyTorch. I tried the CUDA-event-based code below, but the measured time is similar to what I get when I run the code on the CPU and time it with starttime = time.time(). Could you please let me know if there are other ways to measure the execution time on the GPU accurately? Thank you in advance.

import torch

# model, imnoisy, nsigma, and device (a CUDA device) are defined earlier in the script
start_event = torch.cuda.Event(enable_timing=True)
end_event = torch.cuda.Event(enable_timing=True)

start_event.record()
imnoisy = imnoisy.to(device)
nsigma = nsigma.to(device)
outim = torch.clamp(
    imnoisy[:, :1, :, :] - model(imnoisy, nsigma).to(device), 0., 1.)
end_event.record()
torch.cuda.synchronize()  # wait for the GPU work to finish before reading the events
execution_time = start_event.elapsed_time(end_event) / 1000  # elapsed_time() returns milliseconds
print(f"Execution time: {execution_time} seconds")