Should it really be necessary to do var.detach().cpu().numpy()?

If var requires gradient, then var.cpu().detach() first records an autograd edge for the .cpu() copy, and that edge is destroyed almost immediately because the intermediate result is not stored. var.detach().cpu() never creates that edge. The extra work is very cheap, though, so in practice the two are equivalent.
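To make the comparison concrete, here is a minimal sketch, assuming a small tensor of ones and falling back to the CPU when no GPU is available (the names `device`, `direct_ok`, `a`, and `b` are only for illustration). It shows that skipping detach() fails for a tensor that requires grad, and that both call orders give the same NumPy array:

```python
import numpy as np
import torch

# Assumption for illustration: use a GPU if one is present, otherwise the CPU.
device = "cuda" if torch.cuda.is_available() else "cpu"
var = torch.ones(3, device=device, requires_grad=True)

# Without detach(), converting a tensor that requires grad raises RuntimeError.
try:
    var.cpu().numpy()
    direct_ok = True
except RuntimeError:
    direct_ok = False

# detach() first: the .cpu() copy is made on a tensor outside the autograd graph.
a = var.detach().cpu().numpy()

# .cpu() first: the copy is tracked by autograd, then detach() drops the link.
b = var.cpu().detach().numpy()

print(direct_ok, np.array_equal(a, b))
```

Note that when var already lives on the CPU, .cpu() returns the tensor itself, so the difference between the two orders only shows up for tensors on another device.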
