Imagine that I have a network class called myModel, which I instantiate in the following fashion:
net = myModel()
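For context, here is a minimal sketch of the class; the layer is just a placeholder (my real architecture is more complex), but the structure is the usual nn.Module subclass with a forward method:

import torch.nn as nn

class myModel(nn.Module):
    def __init__(self):
        super().__init__()
        # Placeholder layer; the actual architecture is more involved
        self.fc = nn.Linear(10, 1)

    def forward(self, x):
        return self.fc(x)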
In my training loop, I zero out the gradients before calling the forward pass:
for x, y in examples:
    net.zero_grad()
    loss = some_loss_function(net.forward(x), y)
    loss.backward()
    optimizer.step()
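For completeness, the optimizer is constructed over this network's parameters before the loop; the optimizer type and learning rate below are just placeholders for my actual setup:

import torch.optim as optim

# The optimizer references the parameters of the single model being trained
optimizer = optim.SGD(net.parameters(), lr=0.01)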
The previous code seems to work just fine. However, if instead of calling forward explicitly I invoke the call method, i.e. net(x), the gradients are not getting registered. Look at the following code for reference:
for x, y in examples:
    net.zero_grad()
    loss = some_loss_function(net(x), y)
    loss.backward()
    optimizer.step()
The second snippet does not work: backpropagation runs with zero gradients, so the network parameters never change. It is worth mentioning that I am training only one model and that the optimizer references this network's parameters.
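To see the issue, I inspect the gradients right after loss.backward(), along these lines:

for name, param in net.named_parameters():
    # In the second loop, these all come out as zero (or None)
    print(name, None if param.grad is None else param.grad.abs().sum().item())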
Is this behavior intended?