I’ve just started using PyTorch and am working on my first project. What got me confused was the eval() function. I followed a beginner tutorial that showed how to use PyTorch, and it did not use the eval() function for validation. So I trained a model with satisfactory validation performance, but then came across eval and train modes while looking something up. As far as I understand, not calling eval() would only affect how accurate my validation actually is (because dropout and normalization layers would still behave as in training during validation), but it would not affect the training itself. However, when I added model.eval() and model.train() calls around the validation step, I noticed that the training loss for the same number of epochs is now different.
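
For context, this is how I understand what the two modes toggle. A minimal sketch with a made-up toy module (the layer sizes and dropout probability are placeholders, not my actual model):

```
import torch
import torch.nn as nn

torch.manual_seed(0)

# Toy module containing the kinds of layers affected by train()/eval()
toy = nn.Sequential(
    nn.Linear(4, 4),
    nn.BatchNorm1d(4),   # batch statistics in train(), running statistics in eval()
    nn.ReLU(),
    nn.Dropout(p=0.5),   # randomly zeroes activations in train(), identity in eval()
)

x = torch.randn(8, 4)

toy.train()
out_train = toy(x)

toy.eval()
out_eval = toy(x)

# Outputs differ because dropout and batch norm behave differently per mode
print(torch.allclose(out_train, out_eval))  # typically False
```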

```
for i in range(epochs):
    i += 1
    start = time.time()

    y_pred = model(categorical_train_data, numerical_train_data)
    single_loss = loss_function(y_pred, train_outputs)
    aggregated_losses.append(single_loss)

    print(f'epoch: {i:3} loss: {single_loss.item():10.8f}')

    optimizer.zero_grad()
    single_loss.backward()
    optimizer.step()

    with torch.no_grad():
        y_val = model(categorical_test_data, numerical_test_data)
        loss = loss_function(y_val, test_outputs)
```

The code above produces

```
epoch: 1 loss: 0.80006623
epoch: 2 loss: 0.78904432
```

However, if I change it to

```
for i in range(epochs):
    i += 1
    start = time.time()

    y_pred = model(categorical_train_data, numerical_train_data)
    single_loss = loss_function(y_pred, train_outputs)
    aggregated_losses.append(single_loss)

    print(f'epoch: {i:3} loss: {single_loss.item():10.8f}')

    optimizer.zero_grad()
    single_loss.backward()
    optimizer.step()

    model.eval()
    with torch.no_grad():
        y_val = model(categorical_test_data, numerical_test_data)
        loss = loss_function(y_val, test_outputs)
    model.train()
```

I get a different output:

```
epoch: 1 loss: 0.80006623
epoch: 2 loss: 0.78863680
```

I used torch.manual_seed(0), so I know it’s not down to the initial weight distribution. I can run the code multiple times and get the same output in both cases.

As far as I understand, this means that the weights of the model were updated differently. Does that mean that, without using eval mode, I let the validation set influence the training of the actual model, and that the validation performance is therefore not really validation?
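
If it helps, this is a minimal sketch of how I could check whether the weights themselves change during the validation pass (reusing the variable names from the loops above; it only compares parameters returned by named_parameters()):

```
# Snapshot the parameters before the validation pass
before = {name: p.detach().clone() for name, p in model.named_parameters()}

model.eval()
with torch.no_grad():
    y_val = model(categorical_test_data, numerical_test_data)
    loss = loss_function(y_val, test_outputs)
model.train()

# True would mean the validation pass itself changed some parameter
changed = any(not torch.equal(before[name], p.detach())
              for name, p in model.named_parameters())
print(changed)
```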