I’m trying to compute derivatives of functions in a TensorFlow-like fashion. Consider the following code:

```
import torch
x = torch.linspace(-10., 10, 10000, requires_grad=True)
y = x**2
y.backward(torch.ones_like(x))
g = x.grad
```

It works just fine, with `g` being the gradient: here `f(x) = x**2`, so `f'(x) = 2*x`.
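To double-check the CPU case, the computed gradient matches `2*x` elementwise (a quick sketch re-running the snippet above):

```
import torch

x = torch.linspace(-10., 10., 10000, requires_grad=True)
y = x**2
y.backward(torch.ones_like(x))
g = x.grad

# The analytic derivative is f'(x) = 2*x; compare elementwise.
print(torch.allclose(g, 2 * x.detach()))  # True
```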

But if I move the tensor to my GPU with `x = x.cuda()`, it does not work, giving the warning:

```
<stdin>:1: UserWarning: The .grad attribute of a Tensor that is not a leaf Tensor is being accessed. Its .grad attribute won't be populated during autograd.backward(). If you indeed want the .grad field to be populated for a non-leaf Tensor, use .retain_grad() on the non-leaf Tensor. If you access the non-leaf Tensor by mistake, make sure you access the leaf Tensor instead. See github.com/pytorch/pytorch/pull/30531 for more informations. (Triggered internally at C:\cb\pytorch_1000000000000\work\build\aten\src\ATen/core/TensorBody.h:491.)
```

And then `g` is `None`. Why is that? How can I fix it?
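For what it's worth, I see the same behavior on the CPU if I replace `.cuda()` with any other op that returns a new tensor, so here is a minimal reproduction that needs no GPU (assuming the mechanism is the same):

```
import torch

x = torch.linspace(-10., 10., 10000, requires_grad=True)
x = x * 1.0  # stand-in for x.cuda(): any op makes x a non-leaf tensor
print(x.is_leaf)  # False

y = x**2
y.backward(torch.ones_like(x))
print(x.grad)  # None, with the same non-leaf warning as above
```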