How does the multinomial sampling work?

I am dealing with an imbalanced dataset, say #(positive) = 1K and #(negative) = 50K.
I found a library, imbalanced-dataset-sampler (which is based on torch.multinomial), that resamples the data to reduce the skew.

In its basic usage, it passes an array of per-sample weights to torch.multinomial, which returns the sampled indices (with replacement).


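import torch

# weight for each data point: 2e-5 = 1/#(negative), 1e-3 = 1/#(positive)
weights = [2e-5, 1e-3, 2e-5, ...]  # one entry per data point (list elided)
sampled_indices = torch.multinomial(
    torch.as_tensor(weights, dtype=torch.double),  # weights go in as a float tensor
    num_samples=num_pos + num_neg,
    replacement=True,
)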

However, when I index the original dataframe with these sampled indices, #(positive) and #(negative) come out almost equal. What is the reason behind this? Can someone give me some tips?
Moreover, I would like to adjust the positive-to-negative ratio. How do I implement that with torch.multinomial?


This is the desired outcome, as the sampler tries to balance the class indices in the batch.
The reason is that samples with a higher weight are more likely to be drawn than samples with a lower weight. Since you are defining each weight as the inverse of its class frequency, the minority class samples are oversampled: the weights of each class sum to the same total (1K * 1e-3 = 50K * 2e-5 = 1), so every draw is equally likely to come from either class.
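To get a different positive-to-negative ratio, one option is to scale the per-class weights so that each class's weights sum to its target share of the draws. The following is a minimal sketch of that idea, not the library's implementation; the `labels` tensor, the helper `sample_indices`, and its `pos_fraction` parameter are made up for illustration. With `pos_fraction=0.5` it reduces to the 1/#(class) weighting from the question, up to a constant factor.

```python
import torch

torch.manual_seed(0)  # only to make this sketch reproducible

num_pos, num_neg = 1_000, 50_000
# Hypothetical labels standing in for the dataframe column: 1 = positive, 0 = negative.
labels = torch.cat([torch.ones(num_pos, dtype=torch.long),
                    torch.zeros(num_neg, dtype=torch.long)])


def sample_indices(labels, pos_fraction):
    """Draw len(labels) indices so that roughly pos_fraction of them are positive."""
    n_pos = int((labels == 1).sum())
    n_neg = int((labels == 0).sum())
    # Per-class weight: the class's target share divided by its size, so the
    # weights of all samples in a class add up to that share.
    class_weight = torch.tensor(
        [(1.0 - pos_fraction) / n_neg, pos_fraction / n_pos],
        dtype=torch.double,
    )
    weights = class_weight[labels]  # look up one weight per sample
    return torch.multinomial(weights, num_samples=len(labels), replacement=True)


balanced = sample_indices(labels, pos_fraction=0.5)   # about 1:1
skewed = sample_indices(labels, pos_fraction=0.25)    # about 1:3 positive to negative

balanced_pos_share = labels[balanced].float().mean().item()
skewed_pos_share = labels[skewed].float().mean().item()
print(f"positive share, balanced: {balanced_pos_share:.3f}")
print(f"positive share, skewed:   {skewed_pos_share:.3f}")
```

The sampled shares only match the targets approximately, since every index is an independent random draw. torch.multinomial does not require the weights to sum to 1, so only their relative sizes matter.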