Problem: image classification or object detection (YOLO) model predicts well on the test set with transforms.Normalize but fails on camera video

  • After training my CNN or custom ResNet50, I get good predictive performance on the test set when I split my dataset into training/validation/test sets. However, I also want to get the same predictions on camera video or online, and there the predictions are poor: none of the objects can be classified. When I apply my transform to each frame of the video, or to a small crop of each frame, I get a black image or an image that is not what I expect.

  • If I don’t use transforms.Normalize when preparing the data, training becomes unstable, which is very difficult to fix. If I test that model on the camera, I get the same result: poor predictions.

  • I don’t know how to deal with this problem. If the images in the dataset and the real camera images are very different, is it normal for my model to predict badly? Or how can I get a mean and std from my dataset that also work for camera images?

  • My transforms for data augmentation:

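from torchvision import transforms

# img_size: target side length in pixels, defined elsewhere.
train_transform = transforms.Compose([
    transforms.Resize((img_size, img_size)),
    # Pads 30 px on every side, so the output is (img_size + 60) per side.
    transforms.Pad(padding=30),
    transforms.GaussianBlur(kernel_size=(5, 9), sigma=(0.1, 0.5)),
    transforms.RandomHorizontalFlip(p=0.5),
    transforms.RandomVerticalFlip(p=0.5),
    transforms.RandomPerspective(p=0.5),
    # Always rotates by an angle drawn from 30 to 70 degrees.
    transforms.RandomAffine(degrees=(30, 70)),
    transforms.ToTensor(),
    transforms.Normalize(
        mean=[0.58067408, 0.5785063, 0.5922596],
        std=[0.33043968, 0.33169742, 0.33235405]),
])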
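
For reference, per-channel statistics like the mean and std above could be computed roughly as follows. This is only a sketch: the helper name channel_mean_std is hypothetical, and it assumes the images are already (3, H, W) float tensors in [0, 1], i.e. the output of transforms.ToTensor() before any Normalize.

```python
import torch

def channel_mean_std(images):
    """Per-channel mean and std over an iterable of (3, H, W) tensors in [0, 1].

    Hypothetical helper: accumulates sums over all pixels, so images of
    different sizes are weighted by their pixel count.
    """
    n_pixels = 0
    total = torch.zeros(3, dtype=torch.float64)
    total_sq = torch.zeros(3, dtype=torch.float64)
    for img in images:
        img = img.to(torch.float64)
        n_pixels += img.shape[1] * img.shape[2]
        total += img.sum(dim=(1, 2))
        total_sq += (img ** 2).sum(dim=(1, 2))
    mean = total / n_pixels
    # Population std via E[x^2] - E[x]^2; clamp guards against tiny negatives.
    std = (total_sq / n_pixels - mean ** 2).clamp(min=0).sqrt()
    return mean, std
```

These values describe the training images only. As far as I understand, the same mean and std have to be applied to camera frames at inference, since the model was trained on inputs scaled that way; recomputing them on the camera images would change the input distribution the network sees.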

  • My transform for testing the model:

# No padding or augmentation here, so the output is img_size per side.
test_transform = transforms.Compose([
    transforms.Resize((img_size, img_size)),
    transforms.ToTensor(),
    transforms.Normalize(
        mean=[0.58067408, 0.5785063, 0.5922596],
        std=[0.33043968, 0.33169742, 0.33235405]),
])