How to use YOLOv5 after training

I need to train YOLOv5 on my data and then use it in a program. I trained it following a tutorial from their GitHub:

# download the .zip from GitHub and unpack it, then (in a notebook):
%cd /yolo/yolov5/
!python train.py --batch 10 --epochs 40 --data ./data.yaml --cfg ./models/yolov5m.yaml --weights '' --name yolo_m --nosave --cache

I changed the following parameters for training:

nc: 7  # <- in yolov5m.yaml
parser.add_argument('--img-size', default=[504, 378], ...)  # <- in train.py
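For context, a YOLOv5 data.yaml has roughly the following structure; the paths and class names below are placeholders, not the ones from my dataset:

train: ./dataset/images/train  # placeholder path to training images
val: ./dataset/images/val      # placeholder path to validation images
nc: 7  # must match the nc edited in yolov5m.yaml
names: ['class0', 'class1', 'class2', 'class3', 'class4', 'class5', 'class6']  # placeholder names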

The script then saved the models produced during training to ‘./runs/train/’. In another script I load the best model and try to use it:

import cv2
import torch
from torchvision import transforms

device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
weights = torch.load('./runs/train/yolo_m-12/weights/best.pt', map_location=device)
model = weights['model']  # the checkpoint stores the full model under the 'model' key
model = model.to(device)
_ = model.eval()

# terrible translation of the image into the input data :)
image = cv2.imread('image.jpg')  # 504x378 image, loaded as HWC BGR
image = transforms.ToTensor()(image)  # CHW float tensor in [0, 1]
image = image.unsqueeze(0)  # add the batch dimension
image = image.to(device)
image = image.half()  # the trained weights are stored in fp16

model(image)

And this causes the following error:

~/wf/env/lib/python3.9/site-packages/torch/nn/modules/module.py in _call_impl(self, *input, **kwargs)
   1049         if not (self._backward_hooks or self._forward_hooks or self._forward_pre_hooks or _global_backward_hooks
   1050                 or _global_forward_hooks or _global_forward_pre_hooks):
-> 1051             return forward_call(*input, **kwargs)
   1052         # Do not call functions when jit is used
   1053         full_backward_hooks, non_full_backward_hooks = [], []

~/yolo/yolov5/models/yolo.py in forward(self, x, augment, profile)
    121             return self.forward_augment(x)  # augmented inference, None
    122         else:
--> 123             return self.forward_once(x, profile)  # single-scale inference, train
    124 
    125     def forward_augment(self, x):

~/yolo/yolov5/models/yolo.py in forward_once(self, x, profile)
    152                 logger.info(f'{dt[-1]:10.2f} {o:10.2f} {m.np:10.0f}  {m.type}')
    153 
--> 154             x = m(x)  # run
    155             y.append(x if m.i in self.save else None)  # save output
    156 

~/wf/env/lib/python3.9/site-packages/torch/nn/modules/module.py in _call_impl(self, *input, **kwargs)
   1049         if not (self._backward_hooks or self._forward_hooks or self._forward_pre_hooks or _global_backward_hooks
   1050                 or _global_forward_hooks or _global_forward_pre_hooks):
-> 1051             return forward_call(*input, **kwargs)
   1052         # Do not call functions when jit is used
   1053         full_backward_hooks, non_full_backward_hooks = [], []

~/yolo/yolov5/models/common.py in forward(self, x)
    208 
    209     def forward(self, x):
--> 210         return torch.cat(x, self.d)
    211 
    212 

RuntimeError: Sizes of tensors must match except in dimension 2. Got 63 and 64 (The offending index is 0)

How do I fix this error, or how do I load the model/weights correctly so I can run predictions normally?

To be honest, I should probably be asking this on their GitHub rather than here, but maybe someone has encountered this problem.

Usually this type of error is raised when the model contains skip connections (or anything similar where activations are concatenated) and the input has an “unsupported” spatial shape.
“Unsupported” means that the model wasn’t designed to work with arbitrary shapes, but with well-defined ones (or at least shapes that are divisible by 2 often enough).
E.g. while an input of shape [batch_size, 3, 224, 224] might work, the concatenation could fail if you change it to [batch_size, 3, 227, 227] or so.
The error message also points to a mismatch of a single row/column (63 vs. 64), so I would recommend checking the input shape and making sure it’s one the model is supposed to work with.
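To illustrate the point, here is a minimal sketch (not YOLOv5’s own letterbox preprocessing) that pads an arbitrary image up to the next shape divisible by the stride; the stride of 32, the file name, and the variable names are assumptions for the example:

import math

import cv2
import numpy as np
import torch

stride = 32  # assumed: the model's largest downsampling stride

image = cv2.imread('image.jpg')  # hypothetical file; e.g. shape (378, 504, 3)
h, w = image.shape[:2]

# pad height and width up to the next multiple of the stride
new_h = math.ceil(h / stride) * stride  # 378 -> 384
new_w = math.ceil(w / stride) * stride  # 504 -> 512
padded = np.zeros((new_h, new_w, 3), dtype=image.dtype)
padded[:h, :w] = image

# HWC uint8 -> NCHW float batch in [0, 1]
batch = torch.from_numpy(padded).permute(2, 0, 1).float().div(255).unsqueeze(0)
print(batch.shape)  # torch.Size([1, 3, 384, 512])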


I figured out the problem and explained it on the YOLO GitHub, but I will duplicate it here for web search.

Indeed, the error was a tensor size mismatch. I was feeding the model a tensor directly (ToTensor()(image_like_nparray)). If you pass a tensor, it already has to have the size (imgsz, imgsz) (512 in my case), but my tensor was (504, 378), which led to the error.
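For example, one way to satisfy that constraint on the tensor path (a sketch; the interpolation mode is my own choice, and model and device are the ones loaded above) is to resize the batch to the square training size:

import cv2
import torch.nn.functional as F
from torchvision import transforms

image = cv2.imread('image.jpg')  # (378, 504, 3)
batch = transforms.ToTensor()(image).unsqueeze(0)  # [1, 3, 378, 504]
# stretch to the square training size; note this distorts the aspect ratio
batch = F.interpolate(batch, size=(512, 512), mode='bilinear', align_corners=False)
pred = model(batch.to(device).half())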

And to make it all work correctly, you should pass in a numpy.array instead, and the model will then convert it to a tensor and bring it to the required size by itself.
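For reference, a minimal sketch of that approach via the torch.hub wrapper (assuming the ultralytics/yolov5 'custom' entry point; the wrapper adds pre- and post-processing around the raw model):

import cv2
import torch

# load the trained weights through the hub wrapper
model = torch.hub.load('ultralytics/yolov5', 'custom', path='./runs/train/yolo_m-12/weights/best.pt')

image = cv2.imread('image.jpg')[..., ::-1]  # numpy array, BGR -> RGB
results = model(image)  # the wrapper handles resizing and normalization internally
results.print()
print(results.xyxy[0])  # detections as [x1, y1, x2, y2, confidence, class]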

P.S. @ptrblck Thanks, you’re cool.
