Cross validating using different models

abdualhag · March 4, 2019, 8:04pm

Hi,

I am working on a classifier that classifies millions of images and I am having trouble cross-validating each image in the data set using different models. The code I have below works fine but is extremely inefficient. I tried to classify batches but could not get that to work with the cross-validation. The code below is what I have working right now. Any help to make this code run efficiently is most appreciated. To be clear, the models are trained on two classes, not 1000.

from __future__ import print_function 
from __future__ import division
import torch
import torch.nn as nn
import torch.optim as optim
import numpy as np
import torchvision
from torchvision import datasets, models, transforms
from torch.autograd import Variable
from PIL import Image
import time
import os
import copy
from shutil import copyfile
import sys

#Classify only using GPU
def image_loader(loader, image_name):
    image = Image.open(image_name)
    image = image.convert('RGB')
    image = loader(image)
    image = image.unsqueeze(0)
    image = Variable(image)
    return image.cuda()
input_size = 224
data_transforms = transforms.Compose([
    transforms.Resize(input_size),
    transforms.CenterCrop(input_size),
    transforms.ToTensor(),
    transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225])
])

# Models to choose from [resnet, alexnet, vgg, squeezenet, densenet, inception]
events_path = './run'


for event in os.listdir(events_path):
    if event.endswith(".png"): 
        event_path = events_path + "/" + event
        pred_agrees = 1
        for i in range(6):
            if i == 0: model_name = "vgg"
            if i == 1: model_name = "alexnet"
            if i == 2: model_name = "resnet"
            if i == 3: model_name = "densenet"
            if i == 4: model_name = "squeezenet"   
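            if i == 5:
                model_name = "inception"   
                # Note: data_transforms was already built with 224 above,
                # so reassigning input_size here does not change the resize.
                input_size = 299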
            model = torch.load("./" + model_name + ".pt")
            model.eval()
            
            prediction = model(image_loader(data_transforms, event_path)) 
            prediction = int(prediction.argmax())
            if (prediction == 1): 
                pred_agrees *= 2
        
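        # 2**6 == 64, so this holds only when all six models predicted class 1.
        # Kept inside the .png check so non-image entries never reuse stale values.
        if (pred_agrees >= 64): 
            copyfile(event_path, events_path + '/payload/' + event)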

Kushaj · March 4, 2019, 8:29pm

Don’t use the image_loader function. Use torch.utils.data.DataLoader, which is the standard way to load data in PyTorch and lets you classify in batches. Also, Variable is deprecated; since PyTorch 0.4 you can pass tensors to the model directly.
Refer to the PyTorch data loading tutorial.
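
A rough sketch of what that could look like, assuming the models are loaded once before the loop, are already in eval mode on the target device, and all take the same input size. The names PngFolder, all_models_agree and copy_agreed are made up for illustration; they are not part of PyTorch. A model that needs a different input size (such as the 299 for inception in the original post) would need its own transform and its own pass over the data.

```python
import os
import shutil

import torch
from PIL import Image
from torch.utils.data import DataLoader, Dataset


class PngFolder(Dataset):
    """Hypothetical dataset: yields (image_tensor, file_name) for each .png in root."""

    def __init__(self, root, transform):
        self.root = root
        self.transform = transform
        # Only files ending in .png are picked up; subfolders are ignored.
        self.names = sorted(f for f in os.listdir(root) if f.endswith(".png"))

    def __len__(self):
        return len(self.names)

    def __getitem__(self, idx):
        name = self.names[idx]
        image = Image.open(os.path.join(self.root, name)).convert("RGB")
        return self.transform(image), name


@torch.no_grad()
def all_models_agree(models, batch, target_class=1):
    """Return a bool tensor that is True where every model predicts target_class."""
    agree = torch.ones(batch.shape[0], dtype=torch.bool, device=batch.device)
    for model in models:
        agree &= model(batch).argmax(dim=1) == target_class
    return agree


def copy_agreed(models, dataset, out_dir, batch_size=64, num_workers=4, device="cuda"):
    """Copy every image that all models assign to class 1; return the copied names."""
    os.makedirs(out_dir, exist_ok=True)
    loader = DataLoader(dataset, batch_size=batch_size, num_workers=num_workers)
    copied = []
    for images, names in loader:
        # One forward pass per model per batch instead of per image.
        agree = all_models_agree(models, images.to(device))
        for name, keep in zip(names, agree.tolist()):
            if keep:
                shutil.copyfile(os.path.join(dataset.root, name),
                                os.path.join(out_dir, name))
                copied.append(name)
    return copied
```

With the setup from the first post, usage might look roughly like this (file names taken from that post; on recent PyTorch releases torch.load may need weights_only=False to unpickle a whole model):

```python
# names = ["vgg", "alexnet", "resnet", "densenet", "squeezenet"]
# models = [torch.load("./" + n + ".pt").cuda().eval() for n in names]
# copy_agreed(models, PngFolder("./run", data_transforms), "./run/payload")
```

The main savings come from loading each model once instead of once per image, and from running whole batches through the GPU while the DataLoader workers decode the next images in the background.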