Avoid negative output from yolo model

hyman9090 · March 20, 2022, 11:16am

I am currently building YOLO version 1, following the research paper. According to the paper, the fully connected layers of YOLO version 1 are as follows:

return nn.Sequential(
    nn.Flatten(),
    nn.Linear(1024 * S * S, 4096),
    nn.Dropout(),
    nn.LeakyReLU(0.1),
    nn.Linear(4096, S * S * (C + B * 5)),
)

In the end, the output is a tensor of size 7x7x30, where each 30-element vector describes one grid cell of the image. Since these values are class probabilities, confidences, and box coordinates, every one of them should be zero or positive.

However, nn.Linear can produce negative values, which causes an error: because the loss function takes a square root, a negative value results in nan.
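To make the failure concrete, here is a minimal sketch of what happens when a square root is applied to an unconstrained output. The tensor `raw_wh` is a made-up stand-in for the width/height entries coming out of the last linear layer; it is not taken from the actual model.

```python
import torch

# Hypothetical raw width/height outputs of the last nn.Linear (illustrative values).
raw_wh = torch.tensor([0.25, -0.04])

# torch.sqrt returns nan for negative inputs; the nan then propagates into the loss.
root = torch.sqrt(raw_wh)
print(root)
print(torch.isnan(root))
```

The first entry comes out as a normal number, while the negative entry becomes nan, and any sum or mean that includes it becomes nan as well.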

How can I avoid this error while still keeping the model faithful to the research paper?

Akbar_Shah · March 20, 2022, 11:36pm

I have seen people using a sigmoid activation function after the final layer to avoid negative predictions.
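A rough sketch of that idea, assuming the head from the question with an `nn.Sigmoid` appended. The function name `make_head` and its `in_channels` and `hidden` parameters are introduced here only for illustration, and the sigmoid itself is a deviation from the paper rather than something the paper specifies.

```python
import torch
from torch import nn

S, B, C = 7, 2, 20  # grid size, boxes per cell, classes (gives the 7x7x30 output)

def make_head(in_channels: int = 1024, hidden: int = 4096) -> nn.Sequential:
    """Fully connected head from the question, plus a final sigmoid (assumption)."""
    return nn.Sequential(
        nn.Flatten(),
        nn.Linear(in_channels * S * S, hidden),
        nn.Dropout(),
        nn.LeakyReLU(0.1),
        nn.Linear(hidden, S * S * (C + B * 5)),
        nn.Sigmoid(),  # squashes every output into (0, 1), so nothing is negative
    )

# Small sizes so the example runs quickly; the defaults match the question.
head = make_head(in_channels=8, hidden=16)
features = torch.randn(2, 8, S, S)  # hypothetical backbone feature map
out = head(features)
print(out.shape, out.min().item(), out.max().item())
```

With this change every prediction lies between 0 and 1, which fits targets that are normalized to that range.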

hyman9090 · March 21, 2022, 2:16pm

@Akbar_Shah Thanks for your reply. Do you mean appending a sigmoid activation function after nn.Linear()?

The research paper doesn’t mention a sigmoid activation function, so I didn’t apply one to the network. I don’t know whether doing so is legitimate.

Akbar_Shah · March 21, 2022, 2:51pm

Yes, that’s what I meant. Also, if you are regressing normalized center-form coordinates, maybe you could clamp them to [0, 1] to avoid negatives, since the width/height of the box can’t be negative?
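A small sketch of the clamping variant, assuming the box is stored as normalized (x, y, w, h). The tensor `raw_box` holds made-up values purely for illustration.

```python
import torch

# Hypothetical raw prediction for one box: (x, y, w, h), assumed normalized to [0, 1].
raw_box = torch.tensor([0.5, 1.2, -0.1, 0.3])

# Values below 0 become 0 and values above 1 become 1; the rest are unchanged.
box = raw_box.clamp(min=0.0, max=1.0)
print(box)
```

One thing to keep in mind is that clamped entries receive zero gradient, so a prediction stuck outside the range gets no learning signal through the clamp, whereas a sigmoid stays differentiable everywhere.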

hyman9090 · March 21, 2022, 6:48pm

I still have some problems with my model, though: it somehow outputs ‘nan’ after a few iterations, and I have no idea why.
But the issue of negative outputs has been solved. Thank you very much.
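The thread does not identify the cause of the remaining nan. One possibility worth checking, offered only as a guess, is the square root in the loss: its derivative is infinite at exactly zero, so a width or height that reaches 0 (for example after clamping) can produce inf or nan gradients. The sketch below shows the effect; the small constant `eps` is an assumption added for illustration and is not part of the paper.

```python
import torch

# Gradient of sqrt at exactly 0 is infinite.
w = torch.tensor([0.0, 0.25], requires_grad=True)
torch.sqrt(w).sum().backward()
print(w.grad)

# Adding a small epsilon (an assumption, not from the paper) keeps the gradient finite.
eps = 1e-6
w_safe = torch.tensor([0.0, 0.25], requires_grad=True)
torch.sqrt(w_safe + eps).sum().backward()
print(w_safe.grad)
```

If this turns out not to be the source, torch.autograd.set_detect_anomaly(True) can help locate the operation that first produces nan during the backward pass.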