Grad can be implicitly created only for scalar outputs in call to torch.autograd.backward

Hello,

I am new to PyTorch and deep learning. I have written this piece of code to build a text classifier, following the book NLP with Transformers.

When I run this code (the `train` method is called, which is the last line in the code snippet), I get this error:

Exception has occurred: RuntimeError       (note: full exception trace is shown but execution is paused at: _run_module_as_main)
grad can be implicitly created only for scalar outputs
  File "/llm/.env/lib/python3.10/site-packages/torch/autograd/__init__.py", line 88, in _make_grads
    raise RuntimeError("grad can be implicitly created only for scalar outputs")
  File "/llm/.env/lib/python3.10/site-packages/torch/autograd/__init__.py", line 193, in backward
    grad_tensors_ = _make_grads(tensors, grad_tensors_, is_grads_batched=False)
  File "/llm/.env/lib/python3.10/site-packages/torch/_tensor.py", line 487, in backward
    torch.autograd.backward(
  File "/llm/.env/lib/python3.10/site-packages/transformers/trainer.py", line 2753, in training_step
    loss.backward()
  File "/llm/.env/lib/python3.10/site-packages/transformers/trainer.py", line 1940, in _inner_training_loop
    tr_loss_step = self.training_step(model, inputs)
  File "/llm/.env/lib/python3.10/site-packages/transformers/trainer.py", line 1664, in train
    return inner_training_loop(
  File "/llm/transformers-ch03/classifier.py", line 106, in <module>
    trainer.train()
  File "/usr/local/Cellar/python@3.10/3.10.2/Frameworks/Python.framework/Versions/3.10/lib/python3.10/runpy.py", line 86, in _run_code
    exec(code, run_globals)
  File "/usr/local/Cellar/python@3.10/3.10.2/Frameworks/Python.framework/Versions/3.10/lib/python3.10/runpy.py", line 196, in _run_module_as_main (Current frame)
    return _run_code(code, main_globals, None,
RuntimeError: grad can be implicitly created only for scalar outputs

and I am not sure what is causing it or how to debug and fix it. Can anyone please help me here? I am just following the code samples from the book.

S.

Based on the error message, it seems you are calling `backward` on a tensor containing multiple values, so you would either have to reduce it to a scalar (e.g. via `mean()` or `sum()`) or pass the gradient explicitly:

import torch
import torch.nn as nn

x = torch.randn(1, 10, requires_grad=True)
lin = nn.Linear(10, 10)

out = lin(x)

# fails since out contains multiple values and backward expects a gradient input
# out.backward()
# RuntimeError: grad can be implicitly created only for scalar outputs

# works: reduce the output to a scalar first
out.mean().backward()

Thanks for the response. I fixed it.