PyTorch training and testing loop confusion

Hi everyone, I'm working on a deep learning model where I need to predict (X, Y) coordinates. I've run into an issue: I've been advised to plot the predicted positions at each epoch, but the plot only needs to be generated once, at the end of training. Could someone guide me on how best to structure this so I can move forward? My training loop is as follows:

total_train_losses = []
total_test_losses = []
all_predicted_outputs = []
all_actual_outputs = []

# Training and evaluation process

for epoch in range(num_epochs):
    # --------Training---------
    model.train()
    running_train_loss = 0.0  # Initialize running training loss for each epoch

    for batch_data, batch_target in train_loader:

        optimizer.zero_grad()

        # Forward pass
        predicted_output = model(batch_data)  # Predict X, Y coordinates for the batch

        # Compute loss
        train_loss = criterion(predicted_output, batch_target)

        # Backward pass and optimization
        train_loss.backward()
        optimizer.step()

        running_train_loss += train_loss.item()

    # Compute average training loss
    avg_train_loss = running_train_loss / len(train_loader)

    total_train_losses.append(avg_train_loss)  # Append to total train losses
    # --------Evaluation---------
    model.eval()
    running_test_loss = 0.0
    predicted_outputs = []
    actual_outputs = []
    with torch.no_grad():

        for batch_data, batch_target in test_loader:
            # print("Batch Data (Input):", batch_data)
            # print("Batch Target (Label):", batch_target)
            # break  # Only print the first batch to avoid flooding output
            predicted_output = model(batch_data)
            test_loss = criterion(predicted_output, batch_target)
            running_test_loss += test_loss.item()

            # Store the predicted and actual outputs for analysis
            predicted_outputs.append(predicted_output.cpu().numpy())
            actual_outputs.append(batch_target.cpu().numpy())

        # Compute average test loss
        avg_test_loss = running_test_loss / len(test_loader)

        total_test_losses.append(avg_test_loss)  # Append to total test losses

    # Print train and test loss every 100 epochs
    if (epoch + 1) % 100 == 0:
        print(f"Epoch {epoch+1}/{num_epochs}, Train Loss: {avg_train_loss:.4f}, Test Loss: {avg_test_loss:.4f}")

I don’t fully understand your question and where exactly you are stuck.
In your current code snippet you are already appending the predicted coordinates during the validation run to predicted_outputs, so you should be able to plot these afterwards.
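In case it helps, here is a minimal sketch of how the plot could be generated once, after the training loop finishes, from the arrays you are already collecting. It assumes predicted_outputs and actual_outputs from the last epoch are still in scope and that the model output has shape (batch_size, 2) for the (X, Y) coordinates; matplotlib is used purely as an illustration and the variable names follow your snippet.

import numpy as np
import matplotlib.pyplot as plt

# Concatenate the per-batch arrays collected during the last evaluation pass
pred = np.concatenate(predicted_outputs, axis=0)   # shape: (num_test_samples, 2)
actual = np.concatenate(actual_outputs, axis=0)    # shape: (num_test_samples, 2)

# Plot predicted vs. ground-truth (X, Y) positions once, after training
plt.figure(figsize=(6, 6))
plt.scatter(actual[:, 0], actual[:, 1], s=10, label="Ground truth")
plt.scatter(pred[:, 0], pred[:, 1], s=10, label="Predicted")
plt.xlabel("X")
plt.ylabel("Y")
plt.legend()
plt.title("Predicted vs. actual positions (final epoch)")
plt.show()

Since this runs after the loop, no plotting happens during training itself.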

Sorry for the confusion.
In my training and testing loop I am predicting positions (a predicted output for training and a predicted output for testing). Now I need to calculate the average position error, so I will compute the Euclidean distance between the predicted output and the ground-truth values (for both training and testing). Where I am stuck is: should I use these actual and predicted positions,
predicted_outputs.append(predicted_output.cpu().numpy())
actual_outputs.append(batch_target.cpu().numpy())
or do I need to take the shuffled data (the train and test targets) that I load into the train/test loaders?
I hope this clears up my confusion.

You can use the appended batch outputs and targets for the loss or accuracy calculation.
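As a concrete example, here is a minimal sketch of how the average position error could be computed from the appended batches. It assumes the predictions and targets have shape (num_samples, 2) and reuses the predicted_outputs / actual_outputs names from the snippet above; the same pattern applies to the training predictions if you collect them the same way.

import numpy as np

# Stack the per-batch arrays collected during evaluation
pred = np.concatenate(predicted_outputs, axis=0)   # (num_samples, 2)
actual = np.concatenate(actual_outputs, axis=0)    # (num_samples, 2)

# Euclidean distance between each predicted and true (X, Y) position
errors = np.linalg.norm(pred - actual, axis=1)     # (num_samples,)

mean_position_error = errors.mean()
print(f"Average position error: {mean_position_error:.4f}")

Because the stored predictions and targets come from the same batches in the same order, the pairs are already aligned, so there is no need to go back to the shuffled data in the loaders.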