Here is what you are trying to prove. Let `T1` and `T2` be two tensors created by the `torch.randn()` function, provided the same random seed, with the only difference between them being the moment at which `requires_grad` is set to `True`. That is,

```
import torch

seed = 42
# use the seed to create the first random tensor, with requires_grad set at creation
torch.random.manual_seed(seed)
T1 = torch.randn(2, 5, requires_grad=True)
# use the same seed to create the second random tensor, setting requires_grad afterwards
torch.random.manual_seed(seed)
T2 = torch.randn(2, 5)
T2.requires_grad_(True)  # notice the in-place operation
```
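
Before performing any operations, we can sanity-check that the two tensors hold identical values and that both are leaf tensors (a minimal check, assuming the snippet above was just run):

```
print(torch.equal(T1, T2))     # True: same seed, so same values
print(T1.is_leaf, T2.is_leaf)  # True True: both are leaf tensors
```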

Now, let us perform the exact same operations on both tensors `T1` and `T2`. That way, once we call the `backward()` method with some tensor of the same shape as the output (in this case I chose a tensor of all ones), both tensors `T1` and `T2` should have the same value in their `grad` attribute.

```
# for T1
x1 = 3 * T1
y1 = x1 + 1
z1 = y1 * y1
# for T2
x2 = 3 * T2
y2 = x2 + 1
z2 = y2 * y2
# calling the backward method
z1.backward(torch.ones_like(z1))
z2.backward(torch.ones_like(z2))
# printing the .grad for T1 and T2
print(T1.grad)
print(T2.grad)
#tensor([[ 12.0604, 8.3186, 10.2203, 10.1460, -14.2114],
# [ 2.6461, 45.7476, -5.4839, 14.3098, 10.8123]])
#tensor([[ 12.0604, 8.3186, 10.2203, 10.1460, -14.2114],
# [ 2.6461, 45.7476, -5.4839, 14.3098, 10.8123]])
```

You get the same value for both.
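
To double-check, we can compare the two gradients programmatically and also against the analytic result: since `z = (3*T + 1)**2`, the chain rule gives `dz/dT = 2*(3*T + 1)*3 = 18*T + 6`. A minimal verification sketch, assuming the tensors above are still in scope:

```
# the two gradients are element-wise identical
print(torch.equal(T1.grad, T2.grad))  # True
# and both match the analytic gradient dz/dT = 18*T + 6
print(torch.allclose(T1.grad, 18 * T1.detach() + 6))  # True
```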

**HOWEVER**, the intriguing text in the link you are referring to (url) aims at something different. Let me paste the code that originated the confusion:

```
import math
import torch

weights = torch.randn(784, 10) / math.sqrt(784)
weights.requires_grad_()
```

For this case, **it does** matter where `requires_grad` is set to `True`. If you try to set it in the first line, like this

```
weights = torch.randn(784, 10, requires_grad=True) / math.sqrt(784)
```

after arbitrary operations are performed on `weights` and the `backward()` method is called, you will see a warning from PyTorch saying that you are trying to access the `grad` attribute of a non-leaf tensor, so `weights.grad` is set to `None`. Why? Because in that case `weights` does not satisfy the definition of a leaf tensor: a leaf variable is a variable that was not created by any operation tracked by the autograd engine (see this post for further examples). So, what is keeping `weights` from being a leaf variable? The division by `sqrt(784)`.
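
You can check this directly with the `is_leaf` attribute. A minimal sketch contrasting the two variants (the exact warning text may vary across PyTorch versions):

```
import math
import torch

# non-leaf: the division is an operation tracked by autograd,
# so the resulting tensor is derived from the original randn output
w_bad = torch.randn(784, 10, requires_grad=True) / math.sqrt(784)
print(w_bad.is_leaf)  # False

# leaf: requires_grad_() is set on the final tensor, after the division
w_good = torch.randn(784, 10) / math.sqrt(784)
w_good.requires_grad_()
print(w_good.is_leaf)  # True

# after backward(), only the leaf tensor accumulates a gradient
w_bad.sum().backward()
print(w_bad.grad)  # None (plus a UserWarning about accessing .grad of a non-leaf)

w_good.sum().backward()
print(w_good.grad.shape)  # torch.Size([784, 10])
```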

Try it yourself and let me know!