Deploy PyTorch model on webcam

I am trying to deploy a PyTorch classifier on a webcam feed, but I keep getting errors, mostly "AttributeError: 'collections.OrderedDict' object has no attribute 'load_state_dict'". It is a binary classifier, and I saved the model as a .pt file. I would appreciate help resolving this. Here is the code I am using (collected from this forum):

import numpy as np  
import torch
import torch.nn
import torchvision 
from torch.autograd import Variable
from torchvision import transforms
import PIL 
import cv2
#This is the Label
Labels = { 0 : 'Perfect',
           1 : 'Defected'
        }
# Let's preprocess the inputted frame
data_transforms = torchvision.transforms.Compose([
    torchvision.transforms.Resize(size=(224, 224)),
    torchvision.transforms.RandomHorizontalFlip(),
    torchvision.transforms.ToTensor(),
    torchvision.transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225])
])
device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")   ##Assigning the Device which will do the calculation
model  = torch.load("defect_classifier.pt") #Load model to CPU
model.load_state_dict(torch.load("defect_classifier.pt"))
model  = model.to(device)   #set where to run the model and matrix calculation
model.eval()                #set the device to eval() mode for testing
#Set the Webcam 
def Webcam_720p():
    cap.set(3,1280)
    cap.set(4,720)
def argmax(prediction):
    prediction = prediction.cpu()
    prediction = prediction.detach().numpy()
    top_1 = np.argmax(prediction, axis=1)
    score = np.amax(prediction)
    score = '{:6f}'.format(score)
    prediction = top_1[0]
    result = Labels[prediction]
    return result,score
def preprocess(image):
    image = PIL.Image.fromarray(image) #Webcam frames are numpy array format
                                       #Therefore transform back to PIL image
    print(image)                             
    image = data_transforms(image)
    image = image.float()
    #image = Variable(image, requires_autograd=True)
    image = image.cuda()
    # The model only accepts a 4-D batch tensor, so add a batch dimension
    # to the 3-D image tensor.
    image = image.unsqueeze(0)
    return image
    
    
#Let's start the real-time classification process!
                                  
cap = cv2.VideoCapture(0) #Set the webcam
Webcam_720p()
fps = 0
show_score = 0
show_res = 'Nothing'
sequence = 0
while True:
    ret, frame = cap.read() #Capture each frame
    
    
    if fps == 4:
        image        = frame[100:450,150:570]
        image_data   = preprocess(image)
        print(image_data)
        prediction   = model(image_data)
        result,score = argmax(prediction)
        fps = 0
        if result >= 0.5:
            show_res  = result
            show_score= score
        else:
            show_res   = "Nothing"
            show_score = score
        
    fps += 1
    cv2.putText(frame, '%s' %(show_res),(950,250), cv2.FONT_HERSHEY_SIMPLEX, 2, (255,255,255), 3)
    cv2.putText(frame, '(score = %.5f)' %(show_score), (950,300), cv2.FONT_HERSHEY_SIMPLEX, 1,(255,255,255),2)
    cv2.rectangle(frame,(400,150),(900,550), (250,0,0), 2)
    cv2.imshow("ASL SIGN DETECTER", frame)
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break
cap.release()
cv2.destroyWindow("ASL SIGN DETECTER")

When you saved the model, did you save the state dict or the whole model? You are trying to load both, which you cannot do.

I saved the model like this: torch.save(resnet18.state_dict(), 'defect_classifier.pt')

OK, then you still need to define the model in this file and then call load_state_dict on it. torch.load on its own won't give you a model, because you saved only the state dict and not the entire model.
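To illustrate with a minimal sketch (the tiny `build_model` network and the `tiny_classifier.pt` file name are hypothetical stand-ins, not your actual ResNet-18): when only the state dict is saved, `torch.load` hands back a dictionary of tensors, so the architecture has to be rebuilt in code before `load_state_dict` can be called on it.

```python
import os
import tempfile

import torch
import torch.nn as nn


def build_model():
    # Hypothetical stand-in for the real network; in this thread it would be
    # the same architecture that was used in the training script.
    return nn.Sequential(nn.Linear(4, 8), nn.ReLU(), nn.Linear(8, 2))


# Save only the weights, the same way the checkpoint in this thread was saved.
trained = build_model()
path = os.path.join(tempfile.mkdtemp(), "tiny_classifier.pt")
torch.save(trained.state_dict(), path)

# torch.load returns the saved object: a dict of tensors, not a model.
loaded = torch.load(path, map_location="cpu")
is_state_dict = isinstance(loaded, dict)
has_method = hasattr(loaded, "load_state_dict")  # False, hence the AttributeError

# Rebuild the architecture first, then copy the weights into it.
restored = build_model()
restored.load_state_dict(loaded)
restored.eval()

# Both models should now give identical outputs for the same input.
x = torch.randn(1, 4)
with torch.no_grad():
    same_output = torch.equal(trained.eval()(x), restored(x))
```

The same pattern should apply to a transfer-learning model, as long as the rebuilt architecture matches the one whose state dict was saved.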

Thank you @Dwight_Foster
How do I do that? I used transfer learning.

How did you load your model in the original training script? Where did you define it? You can do the same thing again here and then call load_state_dict afterwards.

@Dwight_Foster Thanks a lot, it worked. I did this:

# Load the model and set it to eval mode
model = torchvision.models.resnet18(pretrained=True)
model.fc = torch.nn.Linear(in_features=512, out_features=2)
model.load_state_dict(torch.load('defect_classifier.pt'))
model.eval()

Now I am getting AssertionError: Torch not compiled with CUDA enabled.
I am trying this:
```
pip install torch -f https://download.pytorch.org/whl/torch_stable.html
```

Does your PyTorch build not have CUDA support even though your computer does?

@Dwight_Foster Not sure. I am using MAC M.
I ran:
```
pip install torch -f https://download.pytorch.org/whl/torch_stable.html
```

Still getting the same error.

You are using a Mac mini? I'm pretty sure you cannot use CUDA on that, so you want to install the CPU version of PyTorch.

I am using a MacBook Pro M1.

Looking in links: https://download.pytorch.org/whl/torch_stable.html
Requirement already satisfied: torch in /Users/mdrahman/opt/anaconda3/lib/python3.8/site-packages (1.7.1)
Requirement already satisfied: typing-extensions in /Users/mdrahman/opt/anaconda3/lib/python3.8/site-packages (from torch) (3.7.4.3)
Requirement already satisfied: numpy in /Users/mdrahman/opt/anaconda3/lib/python3.8/site-packages (from torch) (1.18.2)

I tried with: device = torch.device("cpu")
But no luck.

OK, I didn't think you could use PyTorch on M1.

torch.cuda.is_available() is returning False.

Yes, that is because CUDA is not available on Mac. Are you using CUDA in any part of your script?
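One thing worth checking, as a sketch rather than a definitive fix: the `preprocess` function in the script above calls `image.cuda()`, which requests CUDA no matter how `device` is set. Assuming the same `device` variable is used throughout, something like the following keeps the script device-agnostic (`to_batch` and the zero tensor are hypothetical placeholders for the real preprocessing and frame).

```python
import torch

# Fall back to the CPU when no CUDA device is present.
device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")


def to_batch(image_tensor, device):
    """Add a batch dimension and move the tensor to the chosen device."""
    # .to(device) works on both CPU-only and CUDA builds, unlike .cuda().
    return image_tensor.float().unsqueeze(0).to(device)


frame_tensor = torch.zeros(3, 224, 224)  # placeholder for a transformed frame
batch = to_batch(frame_tensor, device)
```

If the checkpoint was saved on a GPU machine, `torch.load` also accepts a `map_location` argument, for example `torch.load('defect_classifier.pt', map_location=device)`.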

No, I did not use CUDA

Where did you train the model?

Thank you @Dwight_Foster, I fixed the issue. The webcam opens but freezes after a second.
I am also getting a new error:

<PIL.Image.Image image mode=RGB size=420x350 at 0x2875F135700>
tensor([[[[ 0.5707,  0.5707,  0.6221,  ...,  0.1939,  0.2111,  0.2111],
      [ 0.5878,  0.5707,  0.6049,  ...,  0.1939,  0.2111,  0.2111],
      [ 0.6049,  0.5707,  0.5878,  ...,  0.1939,  0.2111,  0.2111],
      ...,
      [-1.4843, -1.4843, -1.5014,  ...,  0.3309,  0.3481,  0.3309],
      [-1.4843, -1.4672, -1.3473,  ...,  0.3138,  0.3309,  0.3138],
      [-1.4672, -1.4672, -1.3302,  ...,  0.3138,  0.3309,  0.3309]],

     [[ 0.6604,  0.6604,  0.7129,  ...,  0.1877,  0.2052,  0.2052],
      [ 0.6779,  0.6604,  0.6954,  ...,  0.1877,  0.2052,  0.2052],
      [ 0.6954,  0.6604,  0.6779,  ...,  0.1877,  0.2052,  0.2052],
      ...,
      [-1.5280, -1.5280, -1.5455,  ...,  0.5028,  0.5203,  0.5028],
      [-1.5280, -1.4930, -1.3880,  ...,  0.4853,  0.5028,  0.4853],
      [-1.4930, -1.4930, -1.3529,  ...,  0.4853,  0.5028,  0.5028]],

     [[ 0.9494,  0.9494,  1.0017,  ...,  0.2871,  0.3045,  0.3045],
      [ 0.9668,  0.9494,  0.9842,  ...,  0.2871,  0.3045,  0.3045],
      [ 0.9842,  0.9494,  0.9668,  ...,  0.2871,  0.3045,  0.3045],
      ...,
      [-1.2467, -1.2467, -1.2641,  ...,  0.7576,  0.7751,  0.7576],
      [-1.2467, -1.2293, -1.1247,  ...,  0.7402,  0.7576,  0.7402],
      [-1.2293, -1.2467, -1.1247,  ...,  0.7402,  0.7576,  0.7576]]]])

TypeError Traceback (most recent call last)
in
87 result,score = argmax(prediction)
88 fps = 0
---> 89 if result >= 0.5:
90 show_res = result
91 show_score= score

TypeError: '>=' not supported between instances of 'str' and 'float'

Actually, looking at your code, what is the if statement doing? Are you trying to make sure the model is confident in its prediction? If so, you should compare score instead of result.
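A rough sketch of what I mean, under a few assumptions: the model outputs two raw logits, softmax is applied to turn them into probabilities, and the 0.8 threshold is an arbitrary illustrative value. The `softmax` and `classify` helpers are hypothetical names. Note that in the posted `argmax` function, `score` is formatted into a string, so it would have to stay a float for a comparison like this to work.

```python
import math

LABELS = {0: "Perfect", 1: "Defected"}


def softmax(logits):
    """Convert raw logits into probabilities that sum to 1."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def classify(logits, threshold=0.8):
    """Return (label, score); the label is 'Nothing' when confidence is low."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    score = probs[best]  # keep this a float so it can be compared
    label = LABELS[best] if score >= threshold else "Nothing"
    return label, score


label, score = classify([0.0, 3.0])
```

With only two classes, the top softmax probability is always at least 0.5, so a threshold of 0.5 would never produce "Nothing"; that is why the sketch uses a higher value.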

It is supposed to say Perfect or Defected.