I am working through the Object Detection Finetuning tutorial, which is very nice, smooth, and helpful. I think there is a small bug in how the
labels are set, as they should match the documented
"labels (Int64Tensor[N]): the label for each bounding box", or more plausibly,
"labels (Int64Tensor[N]): the label for each object". The code works well with the Penn-Fudan dataset because it has only one object class, i.e. person. If I am correct, then
labels = torch.ones((num_objs,), dtype=torch.int64)
should be replaced with the following:
labels = torch.as_tensor(obj_ids, dtype=torch.int64)
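
To make the difference concrete, here is a minimal sketch using a made-up 4x4 mask (not taken from the dataset). It assumes `obj_ids` holds the unique non-background values of the mask, as in the tutorial's dataset class, and that those values encode the class of each object rather than a per-instance id. If the mask values are per-instance ids, the second line would give each instance a different label.

```python
import torch

# Hypothetical 4x4 mask: 0 is background, each non-zero value marks one object.
mask = torch.tensor(
    [[0, 1, 1, 0],
     [0, 1, 1, 0],
     [2, 2, 0, 0],
     [2, 2, 0, 0]],
    dtype=torch.uint8,
)

# Unique ids in the mask (sorted), with the background id (0) dropped.
obj_ids = torch.unique(mask)[1:]
num_objs = len(obj_ids)

# Current tutorial line: every object gets label 1.
labels_ones = torch.ones((num_objs,), dtype=torch.int64)

# Proposed line: each object's label is taken from its id in the mask.
labels_from_ids = torch.as_tensor(obj_ids, dtype=torch.int64)

print(labels_ones)      # tensor([1, 1])
print(labels_from_ids)  # tensor([1, 2])
```

With a single class both lines happen to agree for the first object only, which is why the issue would not show up when every object is a person.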