Which stream will be synchronized when ‘.cuda()’ is called? The current stream or all the streams?

Hi,
I want to transfer data from the CPU to the GPU on the current stream, and then synchronize only that stream.
So I wonder which stream is synchronized when ‘.cuda()’ is called: the current stream, or does it act like torch.cuda.synchronize() and wait on all streams?

The copy kernel should use the current stream, as seen in copy_kernel_cuda.
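
If it helps, here is a minimal sketch of how you could make the stream explicit rather than rely on the default behaviour. It assumes a reasonably recent PyTorch; the helper name `copy_on_side_stream` is made up for this example, and the CPU fallback is only there so the snippet also runs on a machine without a GPU.

```python
import torch


def copy_on_side_stream(cpu_tensor):
    """Copy a CPU tensor to the GPU on a dedicated stream and wait on that stream only.

    Hypothetical helper for illustration; not part of PyTorch.
    """
    if not torch.cuda.is_available():
        # No GPU present: nothing to copy or synchronize.
        return cpu_tensor

    side_stream = torch.cuda.Stream()

    # Pinned (page-locked) host memory is needed for non_blocking=True
    # to be truly asynchronous with respect to the host.
    pinned = cpu_tensor.pin_memory()

    with torch.cuda.stream(side_stream):
        # Inside this block side_stream is the current stream,
        # so the copy is enqueued on it.
        gpu_tensor = pinned.cuda(non_blocking=True)

    # Block the host until the work queued on side_stream has finished.
    # Unlike torch.cuda.synchronize(), this does not wait on other streams.
    side_stream.synchronize()
    return gpu_tensor


x = torch.arange(8, dtype=torch.float32)
y = copy_on_side_stream(x)
```

Calling `torch.cuda.current_stream().synchronize()` inside the `with` block should be equivalent, since the side stream is the current one there. If the tensor is later used on a different stream, you would still need to order that stream after the copy, for example with `wait_stream`.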