I am a little confused about how to calculate the training and validation loss. Searching the forums, I think the approach below is correct, but I am still posting this question as a sanity check.

I first define my loss function, which uses the default `reduction="mean"`:

```
criterion = nn.CrossEntropyLoss()
```
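Because of `reduction="mean"`, the scalar the criterion returns is already averaged over the batch, so it is a per-sample value rather than a batch total. A minimal pure-Python sketch (using made-up per-sample losses, no tensors) of what that reduction does, and why multiplying back by the batch size recovers the batch total:

```python
# Hypothetical per-sample cross-entropy losses for a batch of 4
per_sample = [0.9, 0.2, 1.5, 0.4]

mean_loss = sum(per_sample) / len(per_sample)  # what reduction="mean" returns
sum_loss = sum(per_sample)                     # what reduction="sum" would return

# Multiplying the batch mean by the batch size undoes the averaging
assert abs(mean_loss * len(per_sample) - sum_loss) < 1e-12
print(mean_loss)  # 0.75
```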

I then accumulate the total loss over all mini-batches in the `running_loss` variable and divide it by the total number of samples in the dataset:

```
# Train the model
for epoch in range(epochs):
    new_model.train()  # make sure the model is back in training mode after eval
    running_loss = 0.0
    running_corrects = 0
    running_total = 0
    for i, (inputs, labels) in enumerate(train_dataloader):
        inputs = inputs.to(device)
        labels = labels.to(device)
        optimizer.zero_grad()
        with torch.amp.autocast(device_type="cuda", dtype=torch.float16):
            outputs = new_model(inputs)
            loss = criterion(outputs, labels)
        scaler.scale(loss).backward()  # scale the loss, then backpropagate
        scaler.step(optimizer)         # unscale the gradients and update the parameters
        scaler.update()                # update the scale factor
        # loss.item() is the batch mean, so multiply by the batch size
        # to accumulate the per-sample total
        running_loss += loss.item() * inputs.size(0)
        _, predicted = torch.max(outputs, 1)
        running_total += labels.size(0)
        running_corrects += (predicted == labels).sum().item()
    # Calculate the training loss and training accuracy for the epoch
    train_loss = running_loss / len(train_dataloader.dataset)
    train_accuracy = 100 * running_corrects / running_total
```
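The reason for the `* inputs.size(0)` weighting is that the last batch is usually smaller than the rest (unless `drop_last=True`), so a plain average of the per-batch means would over-weight it. A small pure-Python sketch with made-up batch means and a ragged last batch:

```python
# Made-up batch-mean losses and batch sizes; the last batch is smaller
batch_means = [1.0, 1.0, 4.0]
batch_sizes = [10, 10, 2]

# Naive average of the batch means over-weights the small last batch
naive = sum(batch_means) / len(batch_means)                # 2.0

# Weighting each batch mean by its size gives the true per-sample mean
running_loss = sum(m * n for m, n in zip(batch_means, batch_sizes))
true_mean = running_loss / sum(batch_sizes)                # 28 / 22 ≈ 1.27

print(naive, true_mean)
```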

I then do the same for the validation loss:

```
# Evaluate on the validation set
correct = 0
total = 0
val_loss = 0.0
new_model.eval()
with torch.no_grad():
    for images, labels in valid_dataloader:
        images = images.to(device)
        labels = labels.to(device)
        outputs = new_model(images)
        _, predicted = torch.max(outputs, 1)
        total += labels.size(0)
        correct += (predicted == labels).sum().item()
        # Again undo the batch-mean reduction before accumulating
        val_loss += criterion(outputs, labels).item() * labels.size(0)
# Calculate the validation accuracy and validation loss
val_accuracy = 100 * correct / total
val_loss /= len(valid_dataloader.dataset)
```