Detic, ValueError from swintransformer.py

I am following the Detectron2/Detic tutorial in order to auto-crop a large training dataset from the Kaggle Whale and Dolphin competition.

The config cfg, vocabulary, etc. are the same as in the tutorial. I have asked a separate question about this code; the solution to that specific problem seems to be pasting an exact copy of the built-in collate_fn function into the notebook (!).

However, I have not yet been able to run a complete evaluation/forward pass over the dataset. After model.eval(), the forward call runs for at least 30 minutes on a Tesla T4, then stops with the traceback shown below. Any ideas why?

Trying

image = Image.open('/home/jupyter/happy-whale-and-dolphin/train_images/00021adfb725ed.jpg')
image.mode

returns ‘RGB’.

Here is a snippet of my code:

def my_dataset_generator():
    data_list = []
    for index, observation in df.iterrows():
        data_dict = {}
        data_dict["file_name"] = observation['file_path']
        data_dict['image_id'] = observation['image']
        data_list.append(data_dict)
    data_list = np.array(data_list)
    print(data_list[0])
    return data_list

DatasetCatalog.register("my_whales5", my_dataset_generator)
data = DatasetCatalog.get('my_whales5')
#print(type(data), data[0])
model = build_model(cfg)  # returns a torch.nn.Module
test_loader = build_detection_test_loader(cfg,
                                          'my_whales5',
                                          batch_size=32,
                                          collate_fn=custom_collate,
                                          mapper=DatasetMapper(cfg, augmentations=[T.Resize((90, 90))]))

model.eval()
with torch.no_grad():
    outputs = model(test_loader)

which yields

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
/tmp/ipykernel_13624/1901225259.py in <module>
      2 model.eval()
      3 with torch.no_grad():
----> 4     outputs = model(test_loader)
      5 
      6 print('Stopped running at: {}'.format(datetime.now().strftime('%H:%M')))

/opt/conda/lib/python3.7/site-packages/torch/nn/modules/module.py in _call_impl(self, *input, **kwargs)
   1100         if not (self._backward_hooks or self._forward_hooks or self._forward_pre_hooks or _global_backward_hooks
   1101                 or _global_forward_hooks or _global_forward_pre_hooks):
-> 1102             return forward_call(*input, **kwargs)
   1103         # Do not call functions when jit is used
   1104         full_backward_hooks, non_full_backward_hooks = [], []

~/Detic/detic/modeling/meta_arch/custom_rcnn.py in forward(self, batched_inputs)
    113         """
    114         if not self.training:
--> 115             return self.inference(batched_inputs)
    116 
    117         images = self.preprocess_image(batched_inputs)

~/Detic/detic/modeling/meta_arch/custom_rcnn.py in inference(self, batched_inputs, detected_instances, do_postprocess)
     95 
     96         images = self.preprocess_image(batched_inputs)
---> 97         features = self.backbone(images.tensor)
     98         proposals, _ = self.proposal_generator(images, features, None)
     99         results, _ = self.roi_heads(images, features, proposals)

/opt/conda/lib/python3.7/site-packages/torch/nn/modules/module.py in _call_impl(self, *input, **kwargs)
   1100         if not (self._backward_hooks or self._forward_hooks or self._forward_pre_hooks or _global_backward_hooks
   1101                 or _global_forward_hooks or _global_forward_pre_hooks):
-> 1102             return forward_call(*input, **kwargs)
   1103         # Do not call functions when jit is used
   1104         full_backward_hooks, non_full_backward_hooks = [], []

/opt/conda/lib/python3.7/site-packages/detectron2/modeling/backbone/fpn.py in forward(self, x)
    124                 ["p2", "p3", ..., "p6"].
    125         """
--> 126         bottom_up_features = self.bottom_up(x)
    127         results = []
    128         prev_features = self.lateral_convs[0](bottom_up_features[self.in_features[-1]])

/opt/conda/lib/python3.7/site-packages/torch/nn/modules/module.py in _call_impl(self, *input, **kwargs)
   1100         if not (self._backward_hooks or self._forward_hooks or self._forward_pre_hooks or _global_backward_hooks
   1101                 or _global_forward_hooks or _global_forward_pre_hooks):
-> 1102             return forward_call(*input, **kwargs)
   1103         # Do not call functions when jit is used
   1104         full_backward_hooks, non_full_backward_hooks = [], []

~/Detic/detic/modeling/backbone/swintransformer.py in forward(self, x)
    602     def forward(self, x):
    603         """Forward function."""
--> 604         x = self.patch_embed(x)
    605 
    606         Wh, Ww = x.size(2), x.size(3)

/opt/conda/lib/python3.7/site-packages/torch/nn/modules/module.py in _call_impl(self, *input, **kwargs)
   1100         if not (self._backward_hooks or self._forward_hooks or self._forward_pre_hooks or _global_backward_hooks
   1101                 or _global_forward_hooks or _global_forward_pre_hooks):
-> 1102             return forward_call(*input, **kwargs)
   1103         # Do not call functions when jit is used
   1104         full_backward_hooks, non_full_backward_hooks = [], []

~/Detic/detic/modeling/backbone/swintransformer.py in forward(self, x)
    427         """Forward function."""
    428         # padding
--> 429         _, _, H, W = x.size()
    430         if W % self.patch_size[1] != 0:
    431             x = F.pad(x, (0, self.patch_size[1] - W % self.patch_size[1]))

ValueError: too many values to unpack (expected 4)

It seems this line inside the internal forward method:

_, _, H, W = x.size()

expects x to have 4 dimensions, which isn’t the case in your script.
Could you check which input shape is expected and what you are currently passing to the model?
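
One way to check what is being passed is to pull a single batch out of the loader and summarise its structure before calling the model. The sketch below is a minimal, library-independent helper. `describe` and `FakeTensor` are hypothetical names, and the helper only assumes that tensors or arrays expose a `.shape` attribute. In the notebook it would be called as `describe(next(iter(test_loader)))`.

```python
def describe(obj, depth=0, max_items=2):
    """Return a short, indented summary of a nested batch structure.

    Hypothetical helper: anything with a `.shape` attribute (for example a
    torch.Tensor or a numpy array) is reported by its shape; dicts, lists
    and tuples are walked recursively, showing at most `max_items` entries
    of each list or tuple.
    """
    pad = "  " * depth
    shape = getattr(obj, "shape", None)
    if shape is not None:
        return f"{pad}{type(obj).__name__} shape={tuple(shape)}"
    if isinstance(obj, dict):
        lines = [f"{pad}dict with keys {list(obj)}"]
        for key in obj:
            lines.append(f"{pad}  [{key!r}]")
            lines.append(describe(obj[key], depth + 2, max_items))
        return "\n".join(lines)
    if isinstance(obj, (list, tuple)):
        lines = [f"{pad}{type(obj).__name__} of length {len(obj)}"]
        for item in list(obj)[:max_items]:
            lines.append(describe(item, depth + 1, max_items))
        return "\n".join(lines)
    return f"{pad}{type(obj).__name__}: {obj!r}"


class FakeTensor:
    """Stand-in for a tensor so the sketch runs without torch installed."""

    def __init__(self, *shape):
        self.shape = shape


# A batch shaped like a list of per-image dicts.
batch = [{"image": FakeTensor(3, 96, 96), "image_id": "00021adfb725ed"}]
print(describe(batch))
```

If the first batch turns out to be a single dict holding a 4-dimensional image tensor, rather than a list of per-image dicts, that would point at the collate function as the place where the structure changes.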


Thanks. I found that the tensor passed to Detic has 5 dimensions, namely [<Number of batches>, <batch size>, <number of channels>, <height>, <width>]
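
A possible explanation for the extra dimension, assuming the model iterates over whatever it receives and stacks each item's "image" entry (which is how detectron2's `preprocess_image` treats a list of per-image dicts): if the object passed in is the loader itself, each item is an already collated batch, so stacking adds a leading "number of batches" axis. The toy sketch below uses nested lists in place of tensors. `shape_of`, `fake_image` and `stack_images` are hypothetical stand-ins, not Detic code.

```python
def shape_of(nested):
    """Shape of a regular, non-empty nested list, e.g. [[0, 0], [0, 0]] -> (2, 2)."""
    shape = []
    while isinstance(nested, list):
        shape.append(len(nested))
        nested = nested[0]
    return tuple(shape)


def fake_image(channels=3, height=4, width=4):
    """Stand-in for one image tensor of shape (C, H, W)."""
    return [[[0.0] * width for _ in range(height)] for _ in range(channels)]


def stack_images(items):
    """Mimic: collect item["image"] for every item, then stack the results."""
    return [item["image"] for item in items]


# Intended input: a list with one dict per image, so the stack is 4-D.
per_image = [{"image": fake_image()} for _ in range(16)]
print(shape_of(stack_images(per_image)))         # (16, 3, 4, 4)

# A default_collate-style loader yields one dict per *batch*, whose "image"
# entry is already 4-D. Iterating over the loader itself then stacks whole
# batches, which gives a 5-D result.
collated_batches = [
    {"image": [fake_image() for _ in range(16)]} for _ in range(5)
]
print(shape_of(stack_images(collated_batches)))  # (5, 16, 3, 4, 4)
```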

I solved the specific error by changing

_, _, H, W = x.size()

to

try:
  H = x.size()[-2]
  W = x.size()[-1]
except ValueError:
  logging.info('Crash: {}'.format(x.size()))

That got past the unpacking error (indexing from the end works for any number of dimensions), but a later step of the evaluation failed with:

RuntimeError: Expected 4-dimensional input for 4-dimensional weight [128, 3, 4, 4], but got 5-dimensional input of size [3189, 16, 3, 96, 96] instead

As mentioned, I have copy-pasted the default_collate function into my notebook and renamed it custom_collate. Since I am in a cloud Jupyter environment, my debugging options are limited. Without being able to step through it, I find the default_collate function too dense to reverse-engineer.

Any clues as to where I should modify my code in order to strip the fifth dimension from the tensors being passed to Detic?
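
One direction to try, sketched under assumptions: detectron2-style models are normally called on one batch at a time, where a batch is a list of per-image dicts, rather than on the data loader itself. The sketch uses pure-Python stand-ins (`toy_model` and `toy_loader` are hypothetical) so that it runs without detectron2. In the notebook, `model` and `test_loader` would take their place. It also assumes the loader yields lists of dicts, which appears to be the behaviour of `build_detection_test_loader` when no `collate_fn` is given.

```python
def run_inference(model, loader):
    """Call `model` once per batch and collect the per-image outputs."""
    results = []
    for batch in loader:        # one batch = a list of per-image dicts
        outputs = model(batch)  # expected: one output per input dict
        results.extend(outputs)
    return results


# --- hypothetical stand-ins so the sketch runs without detectron2 ---
def toy_model(batch):
    """Pretend detector: returns one result dict per input dict."""
    return [{"image_id": item["image_id"], "instances": []} for item in batch]


def toy_loader(records, batch_size):
    """Yield lists of per-image dicts, like a non-stacking collate function."""
    for start in range(0, len(records), batch_size):
        yield records[start:start + batch_size]


records = [{"image_id": f"img{i}", "image": None} for i in range(5)]
results = run_inference(toy_model, toy_loader(records, batch_size=2))
print(len(results))  # 5: one result per image
```

With the real objects this would read roughly as `with torch.no_grad(): results = run_inference(model, test_loader)`. Processing batch by batch also avoids reading the whole dataset before the first forward pass, which may be why the original call ran for 30 minutes before failing.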