ValueError: Expected target boxes to be a tensorof shape [N, 4], got torch.Size([4])

Hi, I'm doing object detection with Faster R-CNN.

Faster RCNN requires boxes to be a float tensor of shape [N,4].

Since the pandas DataFrame is very slow, I decided to pull the individual columns out into lists like this:

subxmin=[xmins for xmins in df['xmin']]
subymin=[ymins for ymins in df['ymin']]
subxmax=[xmaxs for xmaxs in df['xmax']]
subymax=[ymaxs for ymaxs in df['ymax']]
train_images=[image for image in df['image']]
bboxes=np.array([list(bbox) for bbox in zip(subxmin, subymin, subxmax,subymax)])
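
Because the lists above hold one entry per box, an image with several boxes shows up several times. One possible way to avoid pandas lookups during training is to group the rows once up front, so that each image maps to its own list of N boxes. This is only a sketch: the helper name group_boxes_by_image and the sample data are hypothetical, and it assumes the columns already hold corner coordinates (xmin, ymin, xmax, ymax).

```python
from collections import defaultdict

def group_boxes_by_image(images, xmins, ymins, xmaxs, ymaxs):
    """Map each image name to a list of [xmin, ymin, xmax, ymax] rows."""
    grouped = defaultdict(list)
    for name, x0, y0, x1, y1 in zip(images, xmins, ymins, xmaxs, ymaxs):
        grouped[name].append([float(x0), float(y0), float(x1), float(y1)])
    return dict(grouped)

# Illustrative data (not from the original post): two boxes for a.jpg, one for b.jpg.
boxes_by_image = group_boxes_by_image(
    ["a.jpg", "a.jpg", "b.jpg"],
    [10, 50, 5], [20, 60, 5], [30, 90, 25], [40, 100, 45],
)
image_names = list(boxes_by_image)  # one dataset item per unique image
```

With this layout, a dataset's __getitem__ could look up boxes_by_image[image_names[index]] and get all boxes for that image in one step.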

The dataset is:

class CarData(torch.utils.data.Dataset):
  def __init__(self,dir,images,boxes,df):
    self.dir=dir
    self.images=images
    self.boxes=boxes
    self.df=df

  def __len__(self):
    return len(self.images)

  def __getitem__(self,index):
    id_for_label=self.df[self.df["image"] == self.images[index]]

    area = boxes[:][2] * boxes[:][3]
    area = torch.as_tensor(area, dtype=torch.float32)

    boxes[:][2] = boxes[:][0] + boxes[:] [2]
    boxes[:][3] = boxes[:][1] + boxes[:][3]

    labels = torch.ones((id_for_label.shape[0]), dtype=torch.int64)
    target = {}
    target["boxes"] = torch.tensor(boxes)
    target["labels"] = torch.tensor(labels)
    target["image_id"] = torch.tensor([index])
    target["area"] = area

    return img,target

The training function:

def train(model,optim,dataloader,path,num_of_epochs):

  for epoch in range(num_of_epochs):
    for images,targets in dataloader:
      images=[im.to(device) for im in images]
      targets = [{k: v.to(device) for k, v in t.items()} for t in targets]
      loss_dict = model(images, targets)
      losses = sum(loss for loss in loss_dict.values())
      losses_value = losses.item()


The error I'm getting is:

ValueError: Expected target boxes to be a tensorof shape [N, 4], got torch.Size([4]).

Any solutions?

It seems your boxes tensor is missing dim0. If you are dealing with a single box, you can use boxes = boxes.unsqueeze(0) to add this dimension.
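
A short sketch of what that looks like, using made-up coordinates: a single box stored as a 1-D tensor has shape [4], and either unsqueeze(0) or reshape(-1, 4) turns it into shape [1, 4]. The reshape variant also leaves the shape of an existing [N, 4] tensor unchanged, so it can be applied in both cases.

```python
import torch

# One box as a flat tensor: shape [4], which triggers the ValueError.
single_box = torch.tensor([10.0, 20.0, 110.0, 220.0])

# Add the missing leading dimension: shape [1, 4].
boxes = single_box.unsqueeze(0)

# reshape(-1, 4) gives the same result for a single box ...
boxes_alt = single_box.reshape(-1, 4)

# ... and keeps the shape of a tensor that is already [N, 4].
many = torch.zeros((3, 4)).reshape(-1, 4)
```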

Thank you for your reply.

Yes, in the dataset I use a single box, but during training there is a batch of boxes.

When I add that unsqueeze, I get another error:

RuntimeError: The size of tensor a (380) must match the size of tensor b (3) at non-singleton dimension 0

I think there must be another solution.

I will briefly explain what I want to do:

Faster RCNN requires boxes to be [N,4]. I don't really know what N represents.

  • boxes (FloatTensor[N, 4]): the coordinates of the N bounding boxes in [x0, y0, x1, y1] format, ranging from 0 to W and 0 to H
    This is the official explanation, but still…

If I do the same with the DataFrame it works, but I don't want to train with pandas because pandas is slow.

N is referring to the number of boxes, so if you are dealing with 10 boxes the shape would be [10, 4].
The RuntimeError you get during unsqueeze is a bit strange, as the single bounding box should contain 4 coordinates only. Based on the error message it seems you are trying to cat/stack tensors where one has a shape of 380 in dim0 while the other has 3.
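
To make the shapes concrete, here is a minimal sketch of building the target for one image that has N boxes. The helper name build_target and the sample coordinates are hypothetical. It assumes the boxes are already in [x0, y0, x1, y1] format and that every box belongs to a single class labeled 1, as in the original post.

```python
import torch

def build_target(boxes_xyxy, index):
    """Build a detection target dict from a list of [x0, y0, x1, y1] boxes."""
    # reshape(-1, 4) guarantees shape [N, 4], even for a single flat box.
    boxes = torch.as_tensor(boxes_xyxy, dtype=torch.float32).reshape(-1, 4)
    # Area per box: width * height, giving shape [N].
    area = (boxes[:, 2] - boxes[:, 0]) * (boxes[:, 3] - boxes[:, 1])
    return {
        "boxes": boxes,  # [N, 4]
        "labels": torch.ones((boxes.shape[0],), dtype=torch.int64),  # [N]
        "image_id": torch.tensor([index]),
        "area": area,  # [N]
    }

target = build_target([[10, 20, 110, 220], [30, 40, 60, 80]], index=0)
```

Note that boxes[:, 2] selects the third column of every row, whereas boxes[:][2] selects the third row, which is a common source of shape mistakes with this kind of indexing.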


Yes, but I still could not find a solution.

I'm not sure how you configured the DataLoader. In my case, I get the same error when I don't set collate_fn=utils.collate_fn. See the tutorial example and how it declares the data loader:

# define training and validation data loaders
data_loader = torch.utils.data.DataLoader(
    dataset, batch_size=2, shuffle=True, num_workers=4,
    collate_fn=utils.collate_fn)
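
For context, the collate_fn in the torchvision detection reference utilities regroups the batch into a tuple of images and a tuple of targets rather than stacking them, which matters because each target can hold a different number of boxes. A minimal sketch of an equivalent function follows, with placeholder strings and dicts standing in for real image tensors and targets:

```python
def collate_fn(batch):
    """Turn [(img1, tgt1), (img2, tgt2)] into ((img1, img2), (tgt1, tgt2))."""
    return tuple(zip(*batch))

# Placeholders stand in for image tensors and target dicts.
batch = [("img_a", {"n_boxes": 1}), ("img_b", {"n_boxes": 3})]
images, targets = collate_fn(batch)
```

Passing such a function as collate_fn to the DataLoader keeps the per-image targets separate, which is the form a loop like `for images, targets in dataloader` expects for detection models.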

Yes, that was one of the errors, but there were others too, which I have already solved. Thanks.

Hello, could you please write how you solved it?
I’m having the same problem.

Hi, Lane_Lines_and_Car_Detection/Faster Rcnn at main · TornikeAm/Lane_Lines_and_Car_Detection · GitHub