Memory problem when ensembling several pretrained models

I’m trying to fine-tune ResNet models on Kaggle’s Dogs vs. Cats Redux. Training and testing work fine when each model is trained and validated separately. But when I try to load four of the ResNet models at once (for an ensemble), I run out of memory and the kernel dies, on both GPU and CPU. Is there a clean way to ensemble CNN models without needing more physical memory? When I cut the number of models from 4 to 2, everything seems to work fine.

My computer specs:
Memory: 8 GB
Graphics card: GTX 1080 Ti, 11 GB

My code

import time

import torch
import torch.nn as nn
from tqdm import tqdm


def return_output(model, img):
    output = model(img)
    return output

def Ensemble_model_predict(test_loader, resnet18, resnet50, resnet101, resnet152, criterion):
    since = time.time()
    cuda = torch.cuda.is_available()
    if cuda:
        resnet18 = resnet18.cuda()
        resnet50 = resnet50.cuda()
        resnet101 = resnet101.cuda()
        resnet152 = resnet152.cuda()
    pbar = tqdm(enumerate(test_loader))
    total_correct,total_img,total_loss = 0.0 , 0.0 , 0.0
    for batch_idx,(img, label) in pbar:
        if cuda:
            img = img.cuda()
            label = label.cuda()
        output_18 = return_output(resnet18, img)
        output_50 = return_output(resnet50, img)
        output_101 = return_output(resnet101, img)
        output_152 = return_output(resnet152, img)
        output_ensemble = (output_18 + output_50 + output_101 + output_152)/4
        loss = criterion(output_ensemble, label)
        _, predicted = torch.max(output_ensemble, 1)
        total_correct += (predicted == label).sum().item()
        total_img += label.size(0)
        total_loss += loss.item()
    total_loss = total_loss / total_img
    total_acc = total_correct/total_img
    print("total {} images processed, correct predictions: {}, accuracy: {}".format(total_img, total_correct, total_acc))
    print("loss : {}".format(total_loss))
    since = time.time() - since
    print("Test Completed in {}:{}".format(since//60 , since%60))

criterion = nn.CrossEntropyLoss()
Ensemble_model_predict(test_loader, resnet18, resnet50, resnet101, resnet152, criterion)

You could load the models one by one, save each model's predictions to disk, and ensemble them in a separate step.

That way you are not memory-bound, and you could even add many more models to the ensemble.
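A minimal sketch of that idea, in case it helps. The tiny models, the random dataset, the temporary directory, and the helper names save_predictions and ensemble_from_disk are all hypothetical stand-ins so the snippet runs on its own; in your case you would build or load each fine-tuned ResNet in turn instead. The torch.no_grad() block and the eval() call are my own additions rather than part of the suggestion above: during inference they stop autograd from keeping activations around, which usually lowers memory use as well.

```python
import os
import tempfile

import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset


def save_predictions(model, loader, path, device):
    """Run one model over the loader and write its logits to `path`."""
    model = model.to(device).eval()
    outputs = []
    with torch.no_grad():  # no autograd graph is built during inference
        for img, _ in loader:
            outputs.append(model(img.to(device)).cpu())
    torch.save(torch.cat(outputs), path)


def ensemble_from_disk(paths):
    """Average the logits that save_predictions wrote to disk."""
    return torch.stack([torch.load(p) for p in paths]).mean(dim=0)


# Hypothetical stand-ins for the real test set and the fine-tuned ResNets.
torch.manual_seed(0)
data = TensorDataset(torch.randn(32, 3, 8, 8), torch.randint(0, 2, (32,)))
# shuffle=False so every model sees the samples in the same order.
test_loader = DataLoader(data, batch_size=8, shuffle=False)
builders = {
    "model_a": lambda: nn.Sequential(nn.Flatten(), nn.Linear(192, 2)),
    "model_b": lambda: nn.Sequential(
        nn.Flatten(), nn.Linear(192, 16), nn.ReLU(), nn.Linear(16, 2)
    ),
}

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
out_dir = tempfile.mkdtemp()
paths = []
for name, build in builders.items():
    model = build()  # only one model is alive at a time
    path = os.path.join(out_dir, name + ".pt")
    save_predictions(model, test_loader, path, device)
    paths.append(path)
    del model  # drop the last reference before building the next model
    if device.type == "cuda":
        torch.cuda.empty_cache()

# Ensemble step: needs only the saved logits and the labels, no models.
labels = torch.cat([label for _, label in test_loader])
ensemble_logits = ensemble_from_disk(paths)
predicted = ensemble_logits.argmax(dim=1)
accuracy = (predicted == labels).float().mean().item()
print("ensemble accuracy: {:.3f}".format(accuracy))
```

Since the ensemble step only reads the saved tensors, you can rerun it with different model subsets or weights without touching the GPU again.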

Given my setup, your suggestion is probably the only option I have. Thank you for the reply.