The typical PyTorch training loop is:
for epoch in range(n_epochs):
    # Training
    for data in train_dataloader:
        input, targets = data
        optimizer.zero_grad()
        output = model(input)
        train_loss = criterion(output, targets)
        train_loss.backward()
        optimizer.step()
    # Validation
    with torch.no_grad():
        for input, targets in val_dataloader:
            output = model(input)
            val_loss = criterion(output, targets)
Since the gradients are computed by train_loss.backward(), why do we still need with torch.no_grad() for the validation part? Isn't the purpose of torch.no_grad() to "disable gradient calculation"?
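To make the question concrete, here is a minimal check (using a throwaway nn.Linear as a stand-in for the model) showing that a plain forward pass already tracks operations for autograd, while one wrapped in torch.no_grad() does not:

import torch
import torch.nn as nn

model = nn.Linear(4, 2)   # stand-in for any model with trainable parameters
x = torch.randn(8, 4)     # dummy batch

# A plain forward pass records operations for a later backward():
out = model(x)
print(out.requires_grad)  # True -- an autograd graph is being built

# Under no_grad, the same forward pass records nothing:
with torch.no_grad():
    out = model(x)
print(out.requires_grad)  # False -- no graph is built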