The error is raised if you call .backward() on a non-scalar tensor (i.e. a tensor with more than one element). If that's the intended use case, you have to provide the gradient explicitly, e.g. via .backward(gradient=torch.ones_like(loss)), or reduce the loss to a scalar first (e.g. via loss.mean().backward()).
@albanD explains it in more detail in this post.
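A minimal sketch of both workarounds (the tensor shape and the `x * 2` "loss" are just for illustration):

```python
import torch

x = torch.randn(4, requires_grad=True)
loss = x * 2  # non-scalar "loss" with 4 elements

# loss.backward() here would raise the error, since no gradient
# can be implicitly created for a non-scalar output.

# Option 1: pass the gradient explicitly
loss.backward(gradient=torch.ones_like(loss))
print(x.grad)  # tensor([2., 2., 2., 2.])

# Option 2: reduce the loss to a scalar first
x.grad = None  # reset the accumulated gradient before the next backward pass
loss = x * 2
loss.mean().backward()
print(x.grad)  # tensor([0.5000, 0.5000, 0.5000, 0.5000])
```

Note that Option 1 computes the gradient of loss.sum() in this case, while Option 2 scales it by 1/numel due to the mean reduction.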
PS: you can post code snippets by wrapping them into three backticks ```, which makes debugging easier 