Are these the same in effect?

I checked some threads, and need some clarification on these:

torch.zeros(100, device="cuda")
torch.zeros(100).to("cuda")

Both of these will end up on the GPU as far as I know, but is there any difference?

If the tensor does not require gradients, both will yield the same result.
However, if it does, you should stick to the first approach, since the second one creates a non-leaf variable: the .to("cuda") call is tracked by autograd, so the GPU tensor is the result of an operation rather than a leaf.
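
A quick way to see this (a minimal sketch, assuming a CUDA device is available):

import torch

# Created directly on the GPU: a leaf tensor, so its .grad is populated after backward().
a = torch.zeros(100, device="cuda", requires_grad=True)
print(a.is_leaf)  # True

# Created on the CPU and then moved: .to() is an autograd-tracked op, so the result is non-leaf.
b = torch.zeros(100, requires_grad=True).to("cuda")
print(b.is_leaf)  # False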

Nice feedback.

What about the memory copy operation? Is there any difference in that respect? Which one is more efficient?

The first one should be more efficient, as the tensor will be directly created on the device, while the other will be created on the CPU first, then pushed onto the GPU.
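
You can measure this roughly yourself (a sketch only; exact timings depend on the GPU, driver, and tensor size):

import time
import torch

n = 10_000_000

# Warm up CUDA so one-time initialization doesn't skew the numbers.
torch.zeros(1, device="cuda")
torch.cuda.synchronize()

t0 = time.perf_counter()
a = torch.zeros(n, device="cuda")  # allocated directly on the GPU
torch.cuda.synchronize()
t1 = time.perf_counter()

b = torch.zeros(n).to("cuda")      # allocated on the CPU, then copied to the GPU
torch.cuda.synchronize()
t2 = time.perf_counter()

print(f"direct on GPU: {t1 - t0:.6f}s, via CPU: {t2 - t1:.6f}s")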


Great. It is hard for me to think about this, since I am heavily biased toward CPU processing…

This means that with torch.zeros(100, device="cuda") the tensor will be created directly on the GPU, and the CPU will only issue the instruction to create it (a CUDA kernel launch).

And in the second case we will create the tensor on the CPU first and then copy it to the GPU (the .to() API). Right?
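
The copy semantics can be checked directly (a small sketch; assumes a CUDA device):

import torch

x_cpu = torch.zeros(100)           # allocated in host (CPU) memory
x_gpu = x_cpu.to("cuda")           # copies the data into device (GPU) memory
print(x_cpu.device, x_gpu.device)  # cpu cuda:0 -- the original tensor stays on the CPU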