I have a large training and validation dataset that does not fit in memory, so I am training the model in mini-batches using a data generator. I am not sure how to keep track of the losses so that I can plot a graph and see whether my training and validation loss are on the right track. I implemented the (pseudo)code below. Is my approach correct?
validate_every = 500
training_loss_list = []
iteration = 0  # renamed so it doesn't shadow the built-in iter()

for epoch in range(number_of_epochs):
    for batch_train in train_generator:
        iteration += 1
        X_train, y_train = batch_train
        ...
        output = model(X_train)
        loss = criterion(output, y_train)
        ...
        training_loss_list.append(loss.item())

        if iteration % validate_every == 0:
            validation_loss_list = []
            # Validation data will be different every time next(validation_generator) is called
            model.eval()  # disable dropout/batch-norm updates during validation
            with torch.no_grad():  # no gradients needed for validation
                for _ in range(40):  # '_' avoids shadowing the epoch counter
                    X_val_batch, y_val_batch = next(validation_generator)
                    ...
                    output = model(X_val_batch)
                    loss = criterion(output, y_val_batch).item()
                    validation_loss_list.append(loss)
            model.train()

            training_loss = mean(training_loss_list)  # e.g. statistics.mean
            validation_loss = mean(validation_loss_list)
            print(f'Training loss: {training_loss} Validation loss: {validation_loss}')
            training_loss_list = []  # reset so the next report covers only fresh batches
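For the plotting part of my question, I was thinking of storing the per-interval mean losses in history lists instead of only printing them. Here is a minimal, self-contained sketch of that idea; the `dummy_losses`, interval size, and validation placeholder values are made up for illustration and would be replaced by the real per-batch losses from `criterion(...)`:

```python
from statistics import mean

train_history = []   # (iteration, mean training loss) per interval
val_history = []     # (iteration, mean validation loss) per interval

validate_every = 3
training_loss_list = []

# stand-in for the real per-batch training losses
dummy_losses = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4]

for iteration, loss in enumerate(dummy_losses, start=1):
    training_loss_list.append(loss)
    if iteration % validate_every == 0:
        train_history.append((iteration, mean(training_loss_list)))
        # the real validation loop would go here; placeholder value for the sketch
        val_history.append((iteration, loss + 0.05))
        training_loss_list = []  # reset so the next mean covers only new batches

print(train_history)
print(val_history)

# Plotting afterwards (requires matplotlib):
# import matplotlib.pyplot as plt
# plt.plot(*zip(*train_history), label='training')
# plt.plot(*zip(*val_history), label='validation')
# plt.xlabel('iteration'); plt.ylabel('loss'); plt.legend(); plt.show()
```

This keeps one point per `validate_every` interval on the same iteration axis, so the two curves can be compared directly.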