Using data.to(device) in PyTorch

As the docs state, .to(x) moves a tensor from its current device to the device x, i.e. the CPU ("cpu") or a GPU ("cuda:0", "cuda:1", …). It's useful because you can define x once at the beginning of your code and from then on not care whether it is the CPU or a GPU; you just move all your tensors, models, etc. to it. The alternative is to call .cpu() or .cuda(), but those lack the flexibility of .to(). There is a great explanation of this in the official PyTorch docs here.
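To make the "define it once" pattern concrete, here is a minimal sketch. The toy model and the random input batch are made up for illustration and are not from the original question; the only thing that matters is that the device is chosen in one place and everything else is moved with .to(device).

```python
import torch
import torch.nn as nn

# Choose the device once; fall back to the CPU when no GPU is visible.
device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")

model = nn.Linear(4, 2)   # hypothetical toy model
model.to(device)          # modules are moved in place

data = torch.randn(8, 4)  # hypothetical input batch, created on the CPU
data = data.to(device)    # tensors are not moved in place: keep the returned tensor

output = model(data)      # runs on whichever device was chosen above
print(output.device)
```

Note the asymmetry: calling .to() on a module moves its parameters in place, while calling it on a tensor returns a tensor on the target device, so you have to assign the result.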

As for .zero_grad(), you can find an extensive discussion here.
