In the autograd docs (Automatic differentiation package - torch.autograd, PyTorch 2.1 documentation), I found the following examples:
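For reference, here is a minimal sketch of the two cases I am asking about, assuming they are the b and e tensors from the `is_leaf` examples in that doc. The `move` helper is hypothetical (it is not in the docs): it calls `.cuda()` when a GPU is available and otherwise falls back to a dtype cast, on the assumption that autograd treats both as an ordinary copy operation.

```python
import torch


def move(t):
    # Hypothetical helper, not part of the docs: cast to CUDA when available,
    # otherwise use a dtype cast as a CPU-only stand-in (assumption).
    return t.cuda() if torch.cuda.is_available() else t.double()


# b: the source tensor already requires grad when the cast is applied.
b = move(torch.rand(10, requires_grad=True))
print(b.is_leaf)   # False
print(b.grad_fn)   # a backward node for the cast

# e: the cast is applied to a tensor that does not require grad,
# and requires_grad_() is then set in place on the result.
e = move(torch.rand(10)).requires_grad_()
print(e.is_leaf)   # True
print(e.grad_fn)   # None
```

The only difference between the two lines is whether `requires_grad` is set before or after the cast.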
What is the difference between b and e here? Both involve casting a CPU Tensor to a CUDA Tensor, so why is e not considered a Tensor created by that cast operation, just like b?
Thanks!