I am following the Finetuning Instance Segmentation tutorial with my own data.
I am at the point in the tutorial where the model is trained for 10 epochs. All the code up to this point runs without problems. When I run this section, the results are inconsistent: the first time, it ran for one epoch and then raised an error; the second time, it raised the same error but before finishing even one epoch; the third time, it ran for two epochs and then raised the error again. In the first two failures this line read “…process 0”. In the third attempt:
IndexError: Caught IndexError in DataLoader worker process 1.
In all three attempts, the following line is always the same:
IndexError: too many indices for tensor of dimension 1
This is the stacktrace:
Epoch: [0] [ 0/19] eta: 0:13:06 lr: 0.000282 loss: 8.1070 (8.1070) loss_classifier: 0.4619 (0.4619) loss_box_reg: 0.3113 (0.3113) loss_mask: 7.3135 (7.3135) loss_objectness: 0.0185 (0.0185) loss_rpn_box_reg: 0.0019 (0.0019) time: 41.3855 data: 3.8544 max mem: 0
Epoch: [0] [10/19] eta: 0:04:21 lr: 0.003058 loss: 1.5146 (2.5890) loss_classifier: 0.3459 (0.3226) loss_box_reg: 0.2827 (0.2941) loss_mask: 0.7114 (1.8825) loss_objectness: 0.0403 (0.0862) loss_rpn_box_reg: 0.0034 (0.0036) time: 29.0771 data: 0.3538 max mem: 0
---------------------------------------------------------------------------
IndexError Traceback (most recent call last)
<ipython-input-15-973f272f5759> in <module>()
4 for epoch in range(num_epochs):
5 # train for one epoch, printing every 10 iterations
----> 6 train_one_epoch(model, optimizer, data_loader, device, epoch, print_freq=10)
7 # update the learning rate
8 lr_scheduler.step()
5 frames
/content/engine.py in train_one_epoch(model, optimizer, data_loader, device, epoch, print_freq)
24 lr_scheduler = utils.warmup_lr_scheduler(optimizer, warmup_iters, warmup_factor)
25
---> 26 for images, targets in metric_logger.log_every(data_loader, print_freq, header):
27 images = list(image.to(device) for image in images)
28 targets = [{k: v.to(device) for k, v in t.items()} for t in targets]
/content/utils.py in log_every(self, iterable, print_freq, header)
199 ])
200 MB = 1024.0 * 1024.0
--> 201 for obj in iterable:
202 data_time.update(time.time() - end)
203 yield obj
/usr/local/lib/python3.7/dist-packages/torch/utils/data/dataloader.py in __next__(self)
515 if self._sampler_iter is None:
516 self._reset()
--> 517 data = self._next_data()
518 self._num_yielded += 1
519 if self._dataset_kind == _DatasetKind.Iterable and \
/usr/local/lib/python3.7/dist-packages/torch/utils/data/dataloader.py in _next_data(self)
1197 else:
1198 del self._task_info[idx]
-> 1199 return self._process_data(data)
1200
1201 def _try_put_index(self):
/usr/local/lib/python3.7/dist-packages/torch/utils/data/dataloader.py in _process_data(self, data)
1223 self._try_put_index()
1224 if isinstance(data, ExceptionWrapper):
-> 1225 data.reraise()
1226 return data
1227
/usr/local/lib/python3.7/dist-packages/torch/_utils.py in reraise(self)
427 # have message field
428 raise self.exc_type(message=msg)
--> 429 raise self.exc_type(msg)
430
431
IndexError: Caught IndexError in DataLoader worker process 1.
Original Traceback (most recent call last):
File "/usr/local/lib/python3.7/dist-packages/torch/utils/data/_utils/worker.py", line 202, in _worker_loop
data = fetcher.fetch(index)
File "/usr/local/lib/python3.7/dist-packages/torch/utils/data/_utils/fetch.py", line 44, in fetch
data = [self.dataset[idx] for idx in possibly_batched_index]
File "/usr/local/lib/python3.7/dist-packages/torch/utils/data/_utils/fetch.py", line 44, in <listcomp>
data = [self.dataset[idx] for idx in possibly_batched_index]
File "/usr/local/lib/python3.7/dist-packages/torch/utils/data/dataset.py", line 330, in __getitem__
return self.dataset[self.indices[idx]]
File "<ipython-input-1-4d5623c80b35>", line 62, in __getitem__
area = (boxes[:, 3] - boxes[:, 1]) * (boxes[:, 2] - boxes[:, 0])
IndexError: too many indices for tensor of dimension 1
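If I understand the failing line correctly, this error is what you get when `boxes` ends up as an empty 1-D tensor instead of an `(N, 4)` tensor, which could happen if one of my samples has a mask with no objects in it. This is just my guess at a minimal reproduction, not something I have confirmed against my dataset:

```python
import torch

# A normal sample: two objects give a (2, 4) boxes tensor,
# and the tutorial's area computation indexes it fine.
boxes = torch.tensor([[10., 20., 50., 60.],
                      [30., 40., 70., 80.]])
area = (boxes[:, 3] - boxes[:, 1]) * (boxes[:, 2] - boxes[:, 0])
print(area)  # tensor([1600., 1600.])

# A sample with no objects can produce an empty 1-D tensor,
# and boxes[:, 3] then raises exactly this IndexError.
empty_boxes = torch.tensor([])
try:
    _ = empty_boxes[:, 3]
except IndexError as e:
    print(e)  # too many indices for tensor of dimension 1

# Giving the empty case an explicit (0, 4) shape keeps the
# slicing valid; area just comes out empty.
safe_boxes = torch.zeros((0, 4), dtype=torch.float32)
area = (safe_boxes[:, 3] - safe_boxes[:, 1]) * (safe_boxes[:, 2] - safe_boxes[:, 0])
print(area.shape)  # torch.Size([0])
```

Does that sound plausible, i.e. should `__getitem__` guard against images where the mask contains only background?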
This is not exactly my image (I can’t share the real data), but it is close.
And this is one of my masks.