In the PyTorch beginner tutorial, why do we need to specify a **vector** as input for the **backward()** function to calculate the **gradient/derivative** of a **vector-valued output y** with respect to a vector input variable x, as shown below?

It seems that in the example below the vector v "**scales**" the resulting **x.grad** component-wise. Why is this the case? Could we just pass v as a "**dummy vector**" of all ones, [1.0, 1.0, 1.0], to avoid the scaling?

```python
import torch

x = torch.randn(3, requires_grad=True)
y = x * 2
while y.data.norm() < 1000:
    y = y * 2  # keep doubling until the norm exceeds 1000
v = torch.tensor([0.1, 1.0, 0.0001], dtype=torch.float)
y.backward(v)  # computes the vector-Jacobian product vᵀ·J
print(x.grad)
```
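To check my own understanding, here is a minimal sketch (with a hand-picked x instead of `randn`, so the numbers are predictable): since `backward(v)` computes the vector-Jacobian product vᵀ·J, an all-ones v just sums the rows of the Jacobian, which is the same gradient you get from `y.sum().backward()`:

```python
import torch

# y = x * 2, so the Jacobian is diag(2, 2, 2).
x = torch.tensor([1.0, 2.0, 3.0], requires_grad=True)
y = x * 2
y.backward(torch.ones_like(y))  # v = [1, 1, 1]: no per-component scaling
print(x.grad)  # tensor([2., 2., 2.])

# Equivalent formulation: reduce y to a scalar first, then no vector
# argument is needed at all.
x2 = torch.tensor([1.0, 2.0, 3.0], requires_grad=True)
y2 = x2 * 2
y2.sum().backward()
print(x2.grad)  # tensor([2., 2., 2.])
```

So passing ones does "avoid" the scaling, in the sense that it weights every output component equally.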