Sampling in PyTorch

I am trying to implement Skip-gram following Mikolov’s paper. http://papers.nips.cc/paper/5021-distributed-representations-of-words-and-phrases-and-their-compositionality.pdf

I am wondering if there is an easy way to do multiple negative sampling in PyTorch. Is there a way to define a distribution over the vocabulary by word frequency and sample n words at once (say, n = 5) for computing the negative sampling loss?

I’ve implemented negative sampling in the dataloader before. It’s pretty straightforward: you just need to store the distribution you want to sample from when you create the dataset, then draw from it on each item lookup.
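
For what it’s worth, here’s a minimal sketch of that approach, assuming you already have (center, context) index pairs and a tensor of raw word counts; the names (`SkipGramDataset`, `word_freqs`, etc.) are just for illustration. The paper samples negatives from the unigram distribution raised to the 3/4 power, and `torch.multinomial` accepts unnormalized weights directly, so you can store the weight vector once and sample in `__getitem__`:

```python
import torch
from torch.utils.data import Dataset

class SkipGramDataset(Dataset):
    """Yields (center, context, negatives) index triples for skip-gram training."""

    def __init__(self, pairs, word_freqs, n_negatives=5):
        # pairs: list of (center_idx, context_idx) tuples (assumed precomputed)
        # word_freqs: 1-D tensor of raw word counts, indexed by word id
        self.pairs = pairs
        self.n_negatives = n_negatives
        # Mikolov et al. draw negatives from the unigram distribution
        # raised to the 3/4 power; store the weights once here.
        self.neg_weights = word_freqs.float().pow(0.75)

    def __len__(self):
        return len(self.pairs)

    def __getitem__(self, idx):
        center, context = self.pairs[idx]
        # torch.multinomial takes unnormalized weights, so there is
        # no need to divide by the sum.
        negatives = torch.multinomial(
            self.neg_weights, self.n_negatives, replacement=True
        )
        return center, context, negatives
```

If you wrap this in a `DataLoader`, the negative draws get batched for free along with the positive pairs.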
