Hello, I'm training the fasterrcnn50 model and need help understanding some of the terms it outputs during training and evaluation. The values below come from passing a single batch of size 3 through the model, once in model.train() mode and once in model.eval() mode; I just wanted to see what the model's outputs look like.
Training
What do the terms loss_box_reg, loss_classifier, loss_objectness and loss_rpn_box_reg mean?
{'loss_box_reg': tensor(0.1667, device='cuda:0', grad_fn=<DivBackward0>),
'loss_classifier': tensor(0.1150, device='cuda:0', grad_fn=<NllLossBackward0>),
'loss_objectness': tensor(0.0133, device='cuda:0',
grad_fn=<BinaryCrossEntropyWithLogitsBackward0>),
'loss_rpn_box_reg': tensor(0.0077, device='cuda:0', grad_fn=<DivBackward0>)}
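For context, here is a minimal sketch of how I assume these four values get combined into one training loss. Plain floats copied from the output above stand in for the tensors, and the equal-weight sum is my assumption; the output itself does not confirm it.

```python
# Loss values copied from the training output above, as plain floats.
# In the real run each value is a CUDA tensor with a grad_fn attached.
loss_dict = {
    "loss_box_reg": 0.1667,
    "loss_classifier": 0.1150,
    "loss_objectness": 0.0133,
    "loss_rpn_box_reg": 0.0077,
}

# Assumption: the four terms are added with equal weight into one scalar,
# and that scalar is what backward() would be called on during training.
total_loss = sum(loss_dict.values())

print(f"total loss: {total_loss:.4f}")
```

With tensors instead of floats, the same sum should give a single scalar tensor to backpropagate through.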
Evaluation
Why are my boxes, labels, and scores so big? I expected each image's boxes to contain only four coordinates (x1, y1, x2, y2), and its label to be just one of three options, ["truck", "car", "jeep"], encoded as [0, 1, 2] respectively. And shouldn't scores be a tensor of size one holding a single score?
[{'boxes': tensor([[ 180.0518, 495.6577, 1024.0000, 908.9251],
[ 185.0601, 103.9851, 1024.0000, 518.3134],
[ 182.5600, 338.2882, 1024.0000, 752.6675],
[ 471.0854, 211.9680, 1024.0000, 1003.5166],
[ 83.2046, 226.3543, 810.2725, 1024.0000],
[ 0.0000, 582.8847, 895.5603, 989.6712],
[ 651.6221, 118.1111, 1024.0000, 1024.0000],
[ 229.5882, 396.0160, 1024.0000, 846.7419],
[ 233.9064, 82.7686, 1024.0000, 534.3818],
[ 232.9609, 238.4104, 1024.0000, 691.0627],
[ 231.0070, 548.6355, 1024.0000, 1007.2180],
[ 123.2340, 173.6318, 767.1678, 1024.0000],
[ 0.0000, 311.9489, 517.6907, 1024.0000],
[ 0.0000, 473.5332, 904.4132, 928.5795],
[ 0.0000, 242.5558, 473.8981, 1024.0000],
[ 0.0000, 82.2973, 707.3653, 722.0305],
[ 347.9162, 41.3262, 1008.5706, 907.2560]], device='cuda:0',
grad_fn=<StackBackward0>),
'labels': tensor([1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 1, 2, 2, 1, 2], device='cuda:0'),
'scores': tensor([0.2119, 0.2107, 0.2102, 0.1983, 0.1857, 0.1548, 0.1534, 0.1534, 0.1521,
0.1512, 0.1456, 0.1445, 0.1405, 0.1396, 0.1107, 0.0851, 0.0630],
device='cuda:0', grad_fn=<IndexBackward0>)},
{'boxes': tensor([[ 344.3575, 96.7017, 1024.0000, 930.0717],
[ 95.6406, 417.0934, 1024.0000, 830.2409],
[ 96.3330, 260.8725, 1024.0000, 673.8717],
[ 0.0000, 110.8532, 872.5706, 519.8826],
[ 96.9932, 576.4453, 1024.0000, 988.0748],
[ 0.0000, 250.0097, 702.8840, 1024.0000],
[ 382.6677, 54.5672, 1024.0000, 962.0132],
[ 148.8957, 396.1225, 1024.0000, 846.1454],
[ 149.7371, 239.8830, 1024.0000, 689.7913],
[ 150.1571, 549.3901, 1024.0000, 1007.3665],
[ 200.0044, 173.0603, 845.9658, 1024.0000],
[ 0.0000, 315.7430, 822.4493, 771.3001],
[ 0.0000, 82.2462, 823.9964, 536.9340],
[ 0.0000, 405.8919, 724.5031, 1024.0000],
[ 0.0000, 93.0665, 466.7580, 1024.0000]], device='cuda:0',
grad_fn=<StackBackward0>),
'labels': tensor([1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2, 2], device='cuda:0'),
'scores': tensor([0.2142, 0.2123, 0.2114, 0.1902, 0.1848, 0.1574, 0.1561, 0.1537, 0.1531,
0.1462, 0.1454, 0.1438, 0.1429, 0.1197, 0.1187], device='cuda:0',
grad_fn=<IndexBackward0>)},
{'boxes': tensor([[ 183.7025, 415.5847, 1024.0000, 830.7776],
[ 187.0761, 257.8963, 1024.0000, 672.8000],
[ 427.8109, 137.0997, 1024.0000, 1024.0000],
[ 169.4555, 228.0593, 885.1265, 1024.0000],
[ 20.9872, 575.1481, 1024.0000, 989.0760],
[ 0.0000, 188.6652, 869.6748, 598.1370],
[ 459.9109, 84.2036, 1024.0000, 1024.0000],
[ 121.1884, 408.6773, 1024.0000, 975.6856],
[ 235.6789, 314.0932, 1024.0000, 768.0829],
[ 147.8334, 74.9042, 1024.0000, 633.0103],
[ 0.0000, 322.5314, 505.9696, 1024.0000],
[ 0.0000, 185.5542, 643.5510, 1024.0000],
[ 0.0000, 33.5092, 413.7768, 889.9094]], device='cuda:0',
grad_fn=<StackBackward0>),
'labels': tensor([1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 1, 2, 2], device='cuda:0'),
'scores': tensor([0.2143, 0.2107, 0.2017, 0.1900, 0.1874, 0.1861, 0.1593, 0.1576, 0.1543,
0.1518, 0.1443, 0.1272, 0.0540], device='cuda:0',
grad_fn=<IndexBackward0>)}]
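From what I can tell, each image gets one row in boxes, one entry in labels and one entry in scores per candidate detection, which would explain why there are 17, 15 and 13 of each rather than one. Below is a minimal sketch of how I might reduce that to the confident detections only. It uses plain Python lists holding the first three detections of the first image above; the helper name filter_by_score and the 0.211 threshold are hypothetical and chosen only for illustration.

```python
# Plain-Python stand-in for one entry of the eval output. The values are
# the first three detections of the first image in the output above.
prediction = {
    "boxes": [
        [180.0518, 495.6577, 1024.0000, 908.9251],
        [185.0601, 103.9851, 1024.0000, 518.3134],
        [182.5600, 338.2882, 1024.0000, 752.6675],
    ],
    "labels": [1, 1, 1],
    "scores": [0.2119, 0.2107, 0.2102],
}


def filter_by_score(pred, min_score):
    """Keep only detections scoring at least min_score (hypothetical helper)."""
    keep = [i for i, score in enumerate(pred["scores"]) if score >= min_score]
    return {
        "boxes": [pred["boxes"][i] for i in keep],
        "labels": [pred["labels"][i] for i in keep],
        "scores": [pred["scores"][i] for i in keep],
    }


# Boxes, labels and scores are parallel: one entry each per detection.
assert len(prediction["boxes"]) == len(prediction["labels"]) == len(prediction["scores"])

# 0.211 is an arbitrary threshold picked for this example.
kept = filter_by_score(prediction, min_score=0.211)
print(len(kept["boxes"]))  # prints 1
```

On the real tensors I assume the same thing could be done with a boolean mask such as scores >= threshold applied to all three fields.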