I am currently building YOLO version 1 by following the research paper. According to the paper, the fully connected layers at the end of the model are as follows:
The final output is a 7x7x30 tensor, where each 30-value vector describes one grid cell (box coordinates, confidences, and class probabilities). All of these values should therefore be non-negative (0 or positive).
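For context, this is roughly how I am wiring that last part in PyTorch. It is a simplified sketch assuming S=7, B=2, C=20 as in the paper, and it omits the dropout between the two fully connected layers; the class and variable names are my own placeholders:

```python
import torch
import torch.nn as nn

class YoloV1Head(nn.Module):
    """Final fully connected layers, assuming S=7, B=2, C=20 as in the paper."""
    def __init__(self, S=7, B=2, C=20):
        super().__init__()
        self.S, self.B, self.C = S, B, C
        self.fc = nn.Sequential(
            nn.Flatten(),
            nn.Linear(1024 * S * S, 4096),          # 7x7x1024 conv features -> 4096
            nn.LeakyReLU(0.1),
            nn.Linear(4096, S * S * (B * 5 + C)),   # raw, unbounded linear outputs
        )

    def forward(self, x):
        out = self.fc(x)                             # (N, S*S*(B*5+C)) = (N, 1470)
        return out.view(-1, self.S, self.S, self.B * 5 + self.C)

# quick shape check
head = YoloV1Head()
feats = torch.randn(2, 1024, 7, 7)                   # dummy conv feature map
print(head(feats).shape)                             # torch.Size([2, 7, 7, 30])
```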
However, nn.Linear() produces unbounded outputs, including negative values, and that breaks my loss: the loss function takes the square root of the predicted width and height, and the square root of a negative value yields nan.
I would like to ask how I can avoid these errors while still keeping the model faithful to the research paper.
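To make the failure concrete, here is a tiny reproduction of the nan, together with a sign-preserving square root that I have seen used as a workaround. The `signed_sqrt` helper is only my own illustration, not something prescribed by the paper:

```python
import torch

# minimal reproduction: sqrt of a negative raw prediction gives nan
raw_w = torch.tensor([-0.3, 0.5], requires_grad=True)
print(torch.sqrt(raw_w))           # tensor([nan, 0.7071], ...)

# one workaround (not from the paper): take the sqrt of the magnitude and
# keep the sign, so a negative prediction still gets a finite gradient
# pushing it toward the non-negative target
def signed_sqrt(x: torch.Tensor, eps: float = 1e-6) -> torch.Tensor:
    return torch.sign(x) * torch.sqrt(torch.abs(x) + eps)

target_w = torch.tensor([0.2, 0.4])
loss_wh = ((signed_sqrt(raw_w) - torch.sqrt(target_w)) ** 2).sum()
loss_wh.backward()                 # finite loss and gradients, no nan
print(loss_wh.item(), raw_w.grad)
```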
Yes, that's what I meant. Also, if you are regressing normalized center-form coordinates, maybe you could clamp them to [0, 1] to avoid negatives, since the width and height of a box can't be negative? Something along the lines of the sketch below is what I had in mind.
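Here `raw` stands in for the head's output, and the `0:4` slice is only a placeholder; the exact channels to clamp depend on how you pack the 30 values per cell:

```python
import torch

raw = torch.randn(2, 7, 7, 30)           # placeholder for the raw head output

# option A: keep the network as in the paper and clamp only the box
# coordinates into [0, 1] before they enter the loss
xywh_a = raw[..., 0:4].clamp(0.0, 1.0)

# option B: squash the coordinates with a sigmoid so the network can only
# emit values in (0, 1) -- a small deviation from the paper's plain linear output
xywh_b = torch.sigmoid(raw[..., 0:4])
```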
Though I still have some problems with my model: it somehow outputs nan after a few iterations, and I have no idea why that happens.
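In case it helps anyone hitting the same thing, these are the two generic guards that are usually suggested for a loss that blows up after a few steps: anomaly detection to locate the first nan, and gradient clipping (often combined with a lower learning rate). The model and optimizer below are dummies just to keep the snippet runnable, not my actual training loop:

```python
import torch
import torch.nn as nn

model = nn.Linear(10, 1)
optimizer = torch.optim.SGD(model.parameters(), lr=1e-3)

torch.autograd.set_detect_anomaly(True)   # reports the op that first produces nan in backward

x, y = torch.randn(8, 10), torch.randn(8, 1)
loss = nn.functional.mse_loss(model(x), y)
loss.backward()
torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm=10.0)  # limit exploding gradients
optimizer.step()
```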
But the issue with the negative outputs has been solved. Thank you very much.