GPU cached memory increase when training different models in a loop

Hello all,
I’m training several models one after the other, and my GPU cached memory increases while the second model is training.
Note that when I use the same code to train only one model for many epochs, this doesn't happen.
So it is somehow related to replacing the trained model on the GPU.
For simplicity I used the same architecture, optimizer, etc. for all trained models.

My code is:

import torch

device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
for model_name, training_obj in all_models.items():
    num_epochs = len(training_obj['lr_array'])
    log_message('--------------Training model: ' + model_name + ' -------------------------------')
    optimizer = training_obj['optimizer']
    train_loader = training_obj['train_loader']
    criterion = training_obj['criterion']
    model = training_obj.get('model')
    model.to(device)
    model.train()
    epochs_to_run = range(num_epochs)
    # Loop over epochs to run and train the model
    for epoch in epochs_to_run:
        torch.cuda.empty_cache()
        # Loop over mini batches
        for i_batch, (images, labels) in enumerate(train_loader):
            images = images.to(device)
            labels = labels.to(device)
            # Forward pass
            outputs = model(images).squeeze()
            optimizer.zero_grad()
            loss = criterion(outputs, labels)
            loss.backward()
            optimizer.step()
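
For anyone hitting the same thing: the growth is easiest to see by printing the allocator statistics once per epoch. A minimal sketch, assuming the loop above (the print_gpu_memory helper name is mine, not part of the original code):

import torch

def print_gpu_memory(tag):
    # memory_allocated(): bytes currently held by live tensors
    # memory_reserved(): bytes kept by PyTorch's caching allocator (this is the number that keeps growing)
    allocated = torch.cuda.memory_allocated() / 1024 ** 2
    reserved = torch.cuda.memory_reserved() / 1024 ** 2
    print(f'{tag}: allocated={allocated:.1f} MiB, reserved={reserved:.1f} MiB')

Calling print_gpu_memory(model_name) at the end of every epoch should make the jump visible once the second model starts training.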

Found the problem.
I had to delete the model objects after training so that the garbage collector could clean them up before moving on to training the next model.
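
For reference, the cleanup at the end of each model's training loop looks roughly like this (a sketch based on the loop above; the optimizer is deleted as well because its state keeps references to the model's parameters):

import gc
import torch

# ... after the epoch loop for the current model finishes ...
del model, optimizer          # drop the Python references holding the GPU tensors alive
gc.collect()                  # let the garbage collector reclaim the objects
torch.cuda.empty_cache()      # return the freed blocks from PyTorch's cache to the driver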