Hello everyone. I am working on a retrieval problem where I need to retrieve a key response K from a pool of 100 candidates based on a given query Q. The output of my network is a tensor of scores representing the match between Q and Ki, for each Ki in the pool.

My ground truth is a tensor filled with 0’s except for a single entry that contains 1 representing the correct K to retrieve. (e.g., [1, 0, 0, 0, …, 0] means K1 is the correct response to be retrieved).

To compute the loss I am using BCE. However, since I have the constraint that the result always contains a 1 (so the all-0’s configuration is never valid), I was wondering whether there is a better loss for this type of task.

I ask because the target tensor is a list of 100 values of which 99 are 0, so the model will tend to predict all 0’s to minimize the loss. Is `pos_weight` the only thing I can try?
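
For concreteness, here is a minimal sketch of the setup described above, assuming PyTorch and `BCEWithLogitsLoss` applied to raw scores. The random `logits`, the `batch_size`, and the `pos_weight` value of 99 (the negative-to-positive ratio) are placeholders for illustration, not the actual model or a recommended setting:

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

batch_size, pool_size = 4, 100  # pool of 100 candidates per query (batch size is arbitrary)

# Stand-in for the network output: one raw score (logit) per candidate Ki.
logits = torch.randn(batch_size, pool_size)

# One-hot ground truth: exactly one correct candidate per query.
correct_idx = torch.randint(0, pool_size, (batch_size,))
targets = torch.zeros(batch_size, pool_size)
targets[torch.arange(batch_size), correct_idx] = 1.0

# Plain BCE: the 99 negatives per query dominate the averaged loss.
plain_bce = nn.BCEWithLogitsLoss()
loss_plain = plain_bce(logits, targets)

# BCE with pos_weight: the loss term of the single positive is scaled up.
# 99.0 is an assumed value (number of negatives per positive).
pos_weight = torch.full((pool_size,), float(pool_size - 1))
weighted_bce = nn.BCEWithLogitsLoss(pos_weight=pos_weight)
loss_weighted = weighted_bce(logits, targets)

print(f"plain BCE: {loss_plain.item():.4f}")
print(f"weighted BCE: {loss_weighted.item():.4f}")
```

With `pos_weight` set this way, the one positive entry contributes roughly as much to the loss as the 99 negatives combined, which is the effect the question is asking about.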

Thank you