According to the documentation: “Waits for all kernels in all streams on a CUDA device to complete.”
So my question is whether calling this is needed (or good practice) before starting the validation phase (only one GPU is used).
Thanks.
No, you don’t need to synchronize manually unless, for example, you want to profile or time the code, or you are using custom CUDA streams and want to synchronize the entire device. PyTorch queues its operations on the default stream, where they execute in issue order, so no explicit synchronization is needed.
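To illustrate the profiling case mentioned above, here is a minimal sketch (not from the original thread): because CUDA kernels are launched asynchronously, timing a GPU operation with a host-side clock without a synchronize would mostly measure the kernel launch, not its execution. The guard makes the snippet also run on CPU-only machines:

```python
import time

import torch

# Fall back to CPU if no GPU is available, so the sketch runs anywhere.
device = "cuda" if torch.cuda.is_available() else "cpu"
x = torch.randn(1024, 1024, device=device)

start = time.perf_counter()
y = x @ x  # on CUDA, this kernel is queued asynchronously on the default stream
if device == "cuda":
    # Without this, perf_counter() would be read before the matmul finishes.
    torch.cuda.synchronize()
elapsed = time.perf_counter() - start
print(f"matmul took {elapsed:.6f}s on {device}")
```

Outside of timing or multi-stream scenarios like this, the usual training/validation loop needs no such call: any operation that reads a result back to the host (e.g. `loss.item()`) already synchronizes implicitly.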