@ptrblck_de sir can you help
You are using `model.eval()`, which is used to evaluate your model. When using `model.eval()`, PyTorch will not calculate gradients for your network. As the error clearly tells you, `params.grad` is `None`. Therefore you have no gradients at that parameter and can’t continue with `sum()`. I also don’t believe that, even without the `model.eval()`, you will get gradients without a feedforward step.
model.eval() does not change the gradient calculation, but switches the behavior of some modules to evaluation mode, such as dropout (will be disabled) and batchnorm (will use the running stats instead of the batch stats).
However, as @RupertP explained, the `.grad` attributes are `None` after initializing the model and will only be populated after the first backward pass. It seems you haven’t performed this step yet, so you would have to move the print statement after the first training iteration.
PS: You can post code snippets by wrapping them into three backticks ```, which makes debugging easier.
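The points above can be checked with a minimal sketch. The toy model and input shape below are hypothetical stand-ins and not taken from the original poster's code. It shows that `.grad` is `None` right after model creation, that `backward()` still populates gradients in eval mode, and that `torch.no_grad()` is what actually disables gradient tracking:

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Hypothetical toy model, used only for illustration
model = nn.Sequential(nn.Linear(4, 8), nn.Dropout(p=0.5), nn.Linear(8, 1))

# Right after creation, no gradients exist yet
grads_before = [p.grad for p in model.parameters()]

# eval() only changes layer behavior (dropout off, batchnorm uses running stats)
model.eval()
x = torch.randn(3, 4)

# With dropout disabled, two forward passes give identical outputs
same_output = torch.equal(model(x), model(x))

# Forward and backward pass while still in eval mode
loss = model(x).sum()
loss.backward()
grads_after = [p.grad for p in model.parameters()]

# Gradient tracking is disabled by torch.no_grad(), not by eval()
with torch.no_grad():
    out = model(x)

print(all(g is None for g in grads_before))     # True
print(all(g is not None for g in grads_after))  # True
print(same_output)                              # True
print(out.requires_grad)                        # False
```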
How do I perform the first training iteration on a loaded model? Sorry if it seems like a silly question. I am a bit new to this field and do not have deep knowledge.
After creating the model instance and loading the checkpoint, you would have to perform a forward and backward pass using the model:
```python
# Model creation
model = MyModel()

# Checkpoint loading
model.load_state_dict(state_dict)

# Forward pass
output = model(data)

# Loss calculation
loss = criterion(output, target)

# Backward pass
loss.backward()

# Now you can check the gradients
```
I would recommend taking a look at the tutorials, which explain this workflow in more detail.
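For reference, here is a self-contained sketch of the same workflow that can be run as-is. Everything specific in it is an assumption made for illustration: `MyModel` is a hypothetical one-layer network, the checkpoint is simulated with an in-memory buffer instead of a real file, and the data, target, and `nn.MSELoss` criterion are placeholders.

```python
import io

import torch
import torch.nn as nn


class MyModel(nn.Module):
    """Hypothetical stand-in for the real model."""

    def __init__(self):
        super().__init__()
        self.fc = nn.Linear(10, 2)

    def forward(self, x):
        return self.fc(x)


# Simulate a saved checkpoint in memory (a real one would live in a file)
buffer = io.BytesIO()
torch.save(MyModel().state_dict(), buffer)
buffer.seek(0)

# Model creation and checkpoint loading
model = MyModel()
state_dict = torch.load(buffer)
model.load_state_dict(state_dict)

# Placeholder batch: 4 samples with 10 features, 2 regression targets each
data = torch.randn(4, 10)
target = torch.randn(4, 2)
criterion = nn.MSELoss()

# Forward pass, loss calculation, backward pass
output = model(data)
loss = criterion(output, target)
loss.backward()

# The gradients are now populated, so summing them no longer fails
total = sum(p.grad.sum() for p in model.parameters())
print(total)
```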
`output = model(data)`
What is `data` here? Is it some dummy input?
`data` would be whatever your model expects. It can be a random input for debugging or your real data created by a data loader.
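Both options could look roughly like this. The shapes (10 input features, 2 targets, batch size 4) are assumptions chosen for illustration, and random tensors stand in for a real dataset:

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# Option 1: a random input for debugging, shaped [batch_size, features]
dummy = torch.randn(4, 10)

# Option 2: batches from a DataLoader (random tensors stand in for real data)
dataset = TensorDataset(torch.randn(12, 10), torch.randn(12, 2))
loader = DataLoader(dataset, batch_size=4)
data, target = next(iter(loader))

print(dummy.shape, data.shape, target.shape)
```

Either `dummy` or `data` can then be passed to the model, as long as its shape matches what the first layer expects.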
Can you please help sir regarding this error?
Please post the code here, instead of a screenshot.
Secondly, the error is in the MSE loss. What is the target in your code? I don’t see its definition anywhere. It seems that the shapes of the output and the target going into the MSE loss are not consistent.
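Since the original code was only posted as a screenshot, the following is just a sketch of this kind of failure with made-up shapes. It shows `nn.MSELoss` raising an error when the output and target shapes cannot be broadcast together, and succeeding once the shapes match:

```python
import torch
import torch.nn as nn

criterion = nn.MSELoss()
output = torch.randn(4, 2)      # hypothetical model output
bad_target = torch.randn(4, 3)  # mismatched shape, for illustration only

# Incompatible shapes raise a RuntimeError inside the loss
try:
    criterion(output, bad_target)
    mismatch_raised = False
except RuntimeError as err:
    mismatch_raised = True
    print("Shape mismatch:", output.shape, "vs", bad_target.shape)

# Printing both shapes before the loss call makes the problem easy to spot
good_target = torch.randn(4, 2)
print(output.shape, good_target.shape)
loss = criterion(output, good_target)
```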
I did not want to mention this, but I have to.
When you ask a question and expect other members to spend time helping you, they expect the same effort from you. You can help others by using properly formatted code instead of screenshots, providing relevant data such as the full stack trace instead of just the final error, not mentioning specific members (as this discourages other people from answering), wrapping related issues into a single post rather than a chat-like series, and, most importantly, reading the community guidelines and intro tutorials.
In your case, I suggest you first read the FAQs and complete the basic user tutorial of the Discourse forum.