Loss function with small amount of positives

I have a binary classification model. My data has 1 million samples, and only 100 of them are tagged 1 (the rest are tagged 0).
My model is currently a simple fully-connected network.
I’m having trouble finding a loss function that can cope with the small number of positives.
I looked at BCEWithLogitsLoss, but had some problems with it, so I used the next loss function I ran into (which I think does the same thing):

import torch

def weighted_binary_cross_entropy(output, target, weights=None):
    # clamp probabilities away from 0 and 1 to avoid log(0)
    output = torch.clamp(output, min=1e-8, max=1 - 1e-8)

    if weights is not None:
        assert len(weights) == 2
        # weights[1] scales the positive term, weights[0] the negative term
        loss = weights[1] * (target * torch.log(output)) + \
               weights[0] * ((1 - target) * torch.log(1 - output))
    else:
        loss = target * torch.log(output) + (1 - target) * torch.log(1 - output)

    return torch.neg(torch.mean(loss))
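For comparison, the built-in BCEWithLogitsLoss can apply the same kind of positive-class weighting through its pos_weight argument. This is only a sketch under my assumptions: the model outputs raw logits (no sigmoid applied), and the weight is the ratio of negatives to positives from the counts in the question.

```python
import torch
import torch.nn as nn

# pos_weight scales the positive term of the loss; with 100 positives
# out of 1,000,000 samples, the negatives-to-positives ratio is 9999
pos_weight = torch.tensor([9999.0])
criterion = nn.BCEWithLogitsLoss(pos_weight=pos_weight)

# hypothetical batch: raw logits from the network and 0/1 targets
logits = torch.tensor([2.0, -1.5, 0.3])
targets = torch.tensor([1.0, 0.0, 1.0])

loss = criterion(logits, targets)
```

Note that BCEWithLogitsLoss expects logits, not probabilities, which is a common source of the "problems" people hit when switching from a hand-written BCE on sigmoid outputs.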

My problem is: what values do I put in the weights?
I tried putting 1 for the 0 tag, and for the 1 tag: (len(train) - len(1-tagged)) / len(1-tagged)
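Concretely, with the counts above, that formula gives the following values (a quick check; the variable names here are mine):

```python
n_total = 1_000_000  # total training samples
n_pos = 100          # samples tagged 1

w_neg = 1.0
w_pos = (n_total - n_pos) / n_pos  # (len(train) - len(1-tagged)) / len(1-tagged)

print(w_neg, w_pos)  # 1.0 9999.0
```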

But the model doesn’t seem to learn.

Is this approach not good?
Or maybe the weights are incorrect?
Thanks for any help!