Why doesn't backward() work on nonscalar outputs?

For example,

t1 = torch.tensor([1.0, 3.0])
t1.requires_grad_(True)
t2 = t1 * t1

t2.backward()  # RuntimeError: grad can be implicitly created only for scalar outputs

Why can’t it just put the Jacobian matrix in t1?

PyTorch's autograd implements reverse-mode automatic differentiation, which computes vector-Jacobian products rather than full Jacobian matrices. Calling `y.backward(v)` computes `vᵀJ` for some vector `v`; when `y` is a scalar, `v` defaults to `1`, which is why `backward()` works without arguments only for scalar outputs. For a nonscalar `y` there is no default choice of `v`, so you must either reduce `y` to a scalar first or pass the vector explicitly via the `gradient` argument. This design matches deep learning practice: you reduce the error between predictions and ground truth to a single scalar loss and minimize that, so gradients of scalars are the common case. Materializing the full Jacobian would also be wasteful: for an output of size m and an input of size n it is an m×n matrix, and training never needs it.
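A minimal sketch of the usual workarounds, reusing the `t1`/`t2` example from the question (the lambda passed to `jacobian` is just an inline restatement of `t1 * t1`):

```python
import torch

t1 = torch.tensor([1.0, 3.0], requires_grad=True)
t2 = t1 * t1

# Option 1: reduce the output to a scalar, then backward() needs no arguments.
t2.sum().backward()
print(t1.grad)  # tensor([2., 6.]) — d(sum(x*x))/dx = 2x

# Option 2: supply the vector v for the vector-Jacobian product v^T J.
t1.grad = None          # clear the accumulated gradient first
t2 = t1 * t1
t2.backward(gradient=torch.ones_like(t2))
print(t1.grad)  # tensor([2., 6.]) — equivalent to summing first

# If you genuinely need the full Jacobian, autograd can build it
# (by running one backward pass per output element):
jac = torch.autograd.functional.jacobian(lambda x: x * x, t1)
print(jac)  # tensor([[2., 0.], [0., 6.]])
```

With `v = ones_like(t2)`, the vector-Jacobian product collapses the Jacobian's rows into the same result as differentiating `t2.sum()`, which is why the two options agree.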