Hi, the Faster R-CNN docs say:
boxes (FloatTensor[N, 4]): the ground-truth boxes in [x1, y1, x2, y2] format,
with values of x between 0 and W and values of y between 0 and H
In this case, do x1, y1, x2, y2 correspond to the coordinates of the lower-left and upper-right corners?
But then they cannot satisfy the 0-W and 0-H ranges.
My question is: which positions do x1, y1, x2, y2 refer to?
Thank you
(x1, y1) is the top-left corner and (x2, y2) is the bottom-right corner. I am not sure why they would not fall within the 0-W and 0-H ranges. W and H are the width and height of the image, right? Surely the points cannot lie outside the image?
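To make the corner convention concrete, here is a minimal sketch. It assumes image coordinates with the origin at the top-left pixel and y growing downward, and it assumes your annotations might instead use a bottom-left origin with y growing upward (lower-left and upper-right corners). The helper name to_top_left_origin and the image height of 480 are made up for illustration and are not part of torchvision.

```python
def to_top_left_origin(box, image_height):
    """Convert (x_left, y_bottom, x_right, y_top), measured from a
    bottom-left origin, to (x1, y1, x2, y2) measured from a top-left
    origin, where (x1, y1) is top-left and (x2, y2) is bottom-right."""
    x_left, y_bottom, x_right, y_top = box
    # Only the y axis is flipped; x values stay the same.
    return (x_left, image_height - y_top, x_right, image_height - y_bottom)


image_height = 480                    # hypothetical image height H
cartesian_box = (100, 100, 400, 300)  # lower-left and upper-right corners
print(to_top_left_origin(cartesian_box, image_height))  # (100, 180, 400, 380)
```

If your annotations already use the top-left origin, no conversion is needed and the box can be passed as it is.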
Hmm, take a rectangle like this:
x1: 100 y1: 200
x2: 300 y2: 400
so w: 200, h: 200
and x2, y2 are already outside the 0-200 range.
Your image shape is (200, 200)? I think you are confusing the image height and width with the rectangle height and width.
That's just an example. Another example:
x1: 100, y1: 100
x2: 400, y2: 300
h: 200, w: 300
x2 and y2 also exceed 0-200
H and W are the height and width of the image, NOT of the rectangle. How can the corners be outside the image?
Please read the Faster R-CNN docs:
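A small sketch of the difference, using the box from the example above. The image size of 640 x 480 is an assumption for illustration, and box_size and box_fits_image are hypothetical helpers, not torchvision functions.

```python
def box_size(box):
    """Return (w, h) of the rectangle itself."""
    x1, y1, x2, y2 = box
    return (x2 - x1, y2 - y1)


def box_fits_image(box, image_width, image_height):
    """Check 0 <= x <= W and 0 <= y <= H for both corners,
    with (x1, y1) top-left and (x2, y2) bottom-right."""
    x1, y1, x2, y2 = box
    return 0 <= x1 < x2 <= image_width and 0 <= y1 < y2 <= image_height


box = (100, 100, 400, 300)
print(box_size(box))                  # (300, 200): w and h of the box
print(box_fits_image(box, 640, 480))  # True: compared against image W, H
print(box_fits_image(box, 300, 200))  # False: box w, h used as W, H by mistake
```

The last call shows the confusion in this thread: the corners only look out of range when the box's own w and h are used in place of the image's W and H.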
Implements Faster R-CNN.
The input to the model is expected to be a list of tensors, each of shape [C, H, W], one for each
image, and should be in 0-1 range. Different images can have different sizes.
The behavior of the model changes depending if it is in training or evaluation mode.
During training, the model expects both the input tensors, as well as a targets (list of dictionary),
containing:
- boxes (FloatTensor[N, 4]): the ground-truth boxes in [x1, y1, x2, y2] format, with values of x
between 0 and W and values of y between 0 and H
- labels (Int64Tensor[N]): the class label for each ground-truth box
In this example
x1: 100, y1: 100
x2: 400, y2: 300
h: 200, w: 300
surely the image height is at least 300 and the width at least 400; otherwise the corners would lie outside the image.
Yes, that's what I'm saying. The docs say x and y are between 0-W and 0-H.
But the x and y values in my example are already bigger than 200 (the height).
So is the definition in the docs wrong? Or are the x1, y1, x2, y2 values defined wrongly in my example?
H and W are the image height and width, not the h and w of the box. Thank you @user_123454321