Feedback loop in training is not obvious

Hi,

In math, a loss function takes just two arguments, the expected and the real function values, and returns the error, which the optimizer uses together with the gradients to tweak the NN parameters.
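For example, in that mathematical view a loss is just a plain two-argument function (the name below is only an illustration, not the criterion used in the code further down):

    def loss(expected, real):
        # e.g. squared error: compare prediction with ground truth
        return (expected - real) ** 2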

In the PyTorch examples the loss object never gets a reference to the NN, yet it has a magic method backward(), which apparently does the optimization, even though it takes no arguments and returns no value.

Is there a global variable?

criterion = nn.NLLLoss()

def train(category_tensor, line_tensor):
    hidden = rnn.initHidden()

    rnn.zero_grad()    # clear any gradients left over from a previous call

    # feed the name one character at a time through the RNN
    for i in range(line_tensor.size()[0]):
        output, hidden = rnn(line_tensor[i], hidden)

    loss = criterion(output, category_tensor)

The following line is:

    loss.backward()

    # Add parameters' gradients to their values, multiplied by learning rate
    for p in rnn.parameters():
        p.data.add_(p.grad.data, alpha=-learning_rate)

    return output, loss.item()

That’s not the case, as backward() calculates the gradients. The optimizer.step() method then updates the parameters using the previously calculated gradients, or, as in your code, you update the parameters manually.
This tutorial might be helpful.
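For comparison, here is a minimal sketch of the more common pattern with an optimizer object (the model, data and hyperparameters below are made up for illustration, they are not from the tutorial code above):

    import torch
    import torch.nn as nn

    model = nn.Linear(10, 2)                                    # stand-in model
    criterion = nn.CrossEntropyLoss()
    optimizer = torch.optim.SGD(model.parameters(), lr=0.005)

    x = torch.randn(4, 10)                                      # dummy batch
    y = torch.tensor([0, 1, 0, 1])                              # dummy targets

    optimizer.zero_grad()                 # clear old gradients
    loss = criterion(model(x), y)         # forward pass + loss
    loss.backward()                       # fills p.grad for every parameter
    optimizer.step()                      # uses p.grad to update p, like the manual loop above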

So loss calculation is optional for training?!

In common use cases you would calculate the loss first and call .backward() on it afterwards. However, you could also call backward() directly on the model output if that fits your use case.
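A hedged sketch of both variants (the model and data here are arbitrary); note that backward() without arguments only works on a scalar tensor, so calling it on a non-scalar output requires passing an explicit gradient tensor:

    import torch
    import torch.nn as nn

    model = nn.Linear(3, 2)
    x = torch.randn(5, 3)
    target = torch.randint(0, 2, (5,))

    # common case: reduce to a scalar loss, then backward()
    loss = nn.CrossEntropyLoss()(model(x), target)
    loss.backward()

    # backward() directly on the (non-scalar) output needs an explicit gradient argument
    model.zero_grad()
    output = model(x)
    output.backward(torch.ones_like(output))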

Everything became clear to me after watching the Autograd and Backpropagation sections of the Deep Learning with PyTorch course. Indeed, PyTorch tensors are magic (not just a wrapper around a fast C array): they track their entire history of computations via the requires_grad option, so every operand gets a gradient automatically, including the neuron weights!
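A tiny sketch of that tracking (the numbers are arbitrary):

    import torch

    w = torch.tensor(2.0, requires_grad=True)   # a "weight" whose computation history is tracked
    x = torch.tensor(3.0)                        # plain input, no gradient needed
    y = (w * x) ** 2                             # the multiply and square are recorded in the graph

    y.backward()                                 # dy/dw = 2 * w * x**2 = 36
    print(w.grad)                                # tensor(36.)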