log_probs might be detached from the computation graph. Check if its .grad_fn attribute points to a valid function or is None (detached). In the latter case, print the .grad_fn attribute of the intermediate tensors in your model's forward method to find which operation detaches the tensor from the computation graph. Common causes include re-wrapping a tensor via x = torch.tensor(x), round-tripping through another library such as NumPy, or explicitly calling tensor.detach().
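As a minimal sketch of what to look for (model and shapes are made up for illustration), you can print .grad_fn after each suspicious operation; each of the three patterns mentioned above produces a tensor with grad_fn == None:

```python
import torch

model = torch.nn.Linear(4, 2)
x = torch.randn(3, 4)

out = model(x)
print(out.grad_fn)  # e.g. <AddmmBackward0 ...> -> still attached to the graph

# Common operations that silently detach a tensor:
rewrapped = torch.tensor(out)                       # re-wrap: copies data, emits a UserWarning, drops grad_fn
via_numpy = torch.from_numpy(out.detach().numpy())  # round-trip through NumPy
explicit = out.detach()                             # explicit detach

for t in (rewrapped, via_numpy, explicit):
    print(t.grad_fn)  # None for all three -> backward() through them will fail
```

Printing these inside forward after each step narrows down the exact line where the graph is cut.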
RuntimeError Traceback (most recent call last)
<ipython-input-98-7b6b8391c42e> in <cell line: 1>()
----> 1 trainer.fit(model, data_module)
/usr/local/lib/python3.10/dist-packages/torch/autograd/__init__.py in backward(tensors, grad_tensors, retain_graph, create_graph, grad_variables, inputs)
198 # some Python versions print out the first line of a multi-line function
199 # calls in the traceback and some print out the last line
--> 200 Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass
201 tensors, grad_tensors_, retain_graph, create_graph, inputs,
202 allow_unreachable=True, accumulate_grad=True) # Calls into the C++ engine to run the backward pass
RuntimeError: element 0 of tensors does not require grad and does not have a grad_fn
I have also linked my Colab notebook here; please let me know in case you cannot access the file.