I am working on optimizing the performance of a CUDA program. In PyTorch, I used torch.backends.cudnn.benchmark to optimize performance and torch.cuda.synchronize() to synchronize CUDA applications. To do the same job in TensorFlow, I spent a long time searching for similar code, but I couldn't find anything.
I wonder whether TensorFlow has an equivalent of torch.backends.cudnn.benchmark.
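For readers who have not used these two calls together, here is a minimal sketch of the PyTorch side of the question. The toy Conv2d layer, the input shape, and the timed_forward helper are hypothetical and exist only for illustration; the code falls back to the CPU when no GPU is present, in which case the synchronize calls are skipped.

```python
import time

import torch

# Let cuDNN try several convolution algorithms and cache the fastest one
# for each input shape (this only has an effect on a CUDA device).
torch.backends.cudnn.benchmark = True

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Hypothetical toy layer and input, only for illustration.
conv = torch.nn.Conv2d(3, 16, kernel_size=3, padding=1).to(device)
x = torch.randn(8, 3, 64, 64, device=device)


def timed_forward(module, inputs, iters=10):
    """Return (output, average seconds per forward pass)."""
    with torch.no_grad():
        # Warm-up run so the cuDNN autotuning cost is not part of the timing.
        out = module(inputs)
        if inputs.is_cuda:
            torch.cuda.synchronize()  # wait for queued kernels before starting the clock
        start = time.perf_counter()
        for _ in range(iters):
            out = module(inputs)
        if inputs.is_cuda:
            torch.cuda.synchronize()  # wait for the kernels to finish before stopping it
    return out, (time.perf_counter() - start) / iters


out, seconds = timed_forward(conv, x)
print(tuple(out.shape), f"{seconds * 1e3:.3f} ms per forward pass")
```

Without the synchronize calls, the timer on a GPU would mostly measure how long it takes to queue the kernels, since CUDA kernel launches are asynchronous with respect to the host.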
I'm not sure you'll get the best answer about TensorFlow on this discussion board, so I would recommend using their discussion platforms (Stack Overflow and GitHub issues).
That being said, may I ask how you're optimizing the performance using
Thanks! Actually, I found that sess.run (tf.Session.run) is always synchronous, so TensorFlow doesn't need explicit CUDA synchronization.
I was running multiple processes using CUDA MPS, so it was essential to ensure that every process was synchronized.
Thank you for your reply.
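To illustrate the point about sess.run, here is a small sketch assuming TensorFlow 2.x, where the TF1-style session API is reached through tf.compat.v1. The placeholder and the matmul are hypothetical stand-ins for a real model. Because run() hands back a NumPy array on the host, the computation for the fetched tensor has to be finished by the time the call returns.

```python
import numpy as np
import tensorflow as tf

# TF1-style graph mode, reached through the compat module in TF 2.x.
tf.compat.v1.disable_eager_execution()

# Hypothetical tiny graph, only for illustration.
a = tf.compat.v1.placeholder(tf.float32, shape=(2, 2))
b = tf.matmul(a, a)

with tf.compat.v1.Session() as sess:
    # run() returns only after the fetched value has been computed
    # and copied back to host memory as a NumPy array.
    result = sess.run(b, feed_dict={a: np.eye(2, dtype=np.float32)})

print(type(result).__name__, result.shape)
```

This covers a single process only; it says nothing about ordering between separate processes sharing a GPU through MPS.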