Problems with yolov5s from hub

I hoped I could use yolov5s from torch.hub as a plug-and-play replacement in my standard training code, which works with a PyTorch ResNet-based detection model, but somehow it does not work and I cannot figure out what the issue is. I need to train the model myself first, not just run inference with pretrained weights.

Here is the link

The standard training loop looks like this:

        for i, (imgs, annotations) in enumerate(data_loader):
            imgs = list(img.to(device) for img in imgs)  # list of (3, H, W) tensors
            annotations = [{k: v.to(device) for k, v in t.items()} for t in annotations]
            loss_dict = model(imgs, annotations)  # torchvision detection models take targets and return a loss dict in train mode

If I use it with yolov5s:

model = torch.hub.load('ultralytics/yolov5', 'yolov5s', classes=num_classes, pretrained=False, autoshape=False)  # autoshape=False returns the raw model for training

It first fails because the image tensors are passed in a list; it seems the model wants a single batched tensor…

...
  File "C:\Users\Windows/.cache\torch\hub\ultralytics_yolov5_master\models\yolo.py", line 208, in forward
    return self._forward_augment(x)  # augmented inference, None
  File "C:\Users\Windows/.cache\torch\hub\ultralytics_yolov5_master\models\yolo.py", line 212, in _forward_augment
    img_size = x.shape[-2:]  # height, width
AttributeError: 'list' object has no attribute 'shape'
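
Unlike the torchvision models, the raw YOLOv5 module apparently wants one batched tensor rather than a list of per-image tensors. A minimal sketch of the adaptation, assuming all images in the batch have already been resized to the same (H, W), since torch.stack requires equal shapes:

    import torch

    # build a (B, 3, H, W) batch instead of a list of (3, H, W) tensors
    batch = torch.stack(imgs, dim=0).to(device)
    preds = model(batch)

This gets past the AttributeError.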

When I provide a single batched image tensor and the annotations, it now fails with

...
  File "C:\Users\Windows/.cache\torch\hub\ultralytics_yolov5_master\models\common.py", line 313, in forward
    return torch.cat(x, self.d)
RuntimeError: Sizes of tensors must match except in dimension 1. Expected size 22 but got size 21 for tensor number 1 in the list.

The numbers differ from run to run, but the pattern is always “expected size X but got X-1”.
I cannot understand why this happens or how to fix it.
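
One pattern I noticed: the mismatch is always off by one in a spatial dimension, which makes me suspect the Concat failure happens when the input height and width are not multiples of the model’s maximum stride (32), so the downsampled and re-upsampled feature maps disagree by one cell. A sketch of zero-padding to the next multiple of 32, under that assumption (pad_to_multiple is my own helper, not part of yolov5):

    import torch
    import torch.nn.functional as F

    def pad_to_multiple(img: torch.Tensor, stride: int = 32) -> torch.Tensor:
        """Zero-pad a (B, C, H, W) tensor so H and W become multiples of `stride`."""
        h, w = img.shape[-2:]
        pad_h = (stride - h % stride) % stride
        pad_w = (stride - w % stride) % stride
        # pad right and bottom only; note that box labels normalized to the
        # original size would need rescaling to the padded size
        return F.pad(img, (0, pad_w, 0, pad_h))

    batch = pad_to_multiple(batch)  # e.g. (1, 3, 700, 1050) -> (1, 3, 704, 1056)

But I’m not sure this is the real cause.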

Any ideas where to look?

I’ve been trying to do the same thing you did and ran into the exact same errors. I tried just passing the image into the model, and it does give me 3 tensors as output. That doesn’t make sense to me, because my model is in train mode and I expected it to take annotations along with the image tensor. Let me know if you find a workaround.
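
From reading train.py in the yolov5 repo itself, I don’t think the 3 tensors are a bug: in train mode the raw module returns the raw outputs of its three detection heads, and the loss is computed outside the model by utils.loss.ComputeLoss, with targets given as an (n, 6) tensor of (image_index, class, x, y, w, h) with boxes normalized to [0, 1]. A rough sketch under those assumptions, reusing num_classes, device and the batched images from above (the hyp values are placeholders taken from the repo’s default hyperparameter file, and the exact attributes ComputeLoss expects vary between repo versions):

    import sys
    import torch

    # make the cached hub repo importable so its loss code can be reused
    sys.path.insert(0, r"C:\Users\Windows\.cache\torch\hub\ultralytics_yolov5_master")
    from utils.loss import ComputeLoss

    model = torch.hub.load('ultralytics/yolov5', 'yolov5s', classes=num_classes,
                           pretrained=False, autoshape=False).to(device)
    # ComputeLoss reads hyperparameters off the model, the way train.py sets them up
    model.hyp = {'box': 0.05, 'cls': 0.5, 'cls_pw': 1.0, 'obj': 1.0,
                 'obj_pw': 1.0, 'anchor_t': 4.0, 'fl_gamma': 0.0}
    model.gr = 1.0  # some versions read this iou-loss ratio off the model
    compute_loss = ComputeLoss(model)

    model.train()
    preds = model(batch)  # list of 3 raw head outputs, not a loss dict
    # targets: (n, 6) rows of (image_index_in_batch, class, x_center, y_center, w, h)
    loss, loss_items = compute_loss(preds, targets.to(device))
    loss.backward()

I haven’t verified this end to end, so treat it as a direction rather than a fix.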