I have a dataset with the following columns:
char2 are integers, whereas span is a 2-D Tensor (a matrix) of integers.
I would like to implement negative sampling so that each batch I retrieve from the DataLoader wrapping this dataset comes with a matching batch of negative samples.
For each data row in the batch (a batch may of course contain several rows), I would like N negative samples, where a negative sample is a single row taken from any of the span matrices in my dataset.
Naively, this is how I would retrieve the N negative samples for a single data row (just to illustrate):

    import random

    def getNegativeSamples(dataset, N):
        ret = []
        for i in range(N):
            # Choose which row the data will be pulled from
            dataset_row = dataset[random.randrange(len(dataset))]
            # Assumes rows are dict-like with a "span" key; adjust to your row layout
            span = dataset_row["span"]
            # Choose which row of the span matrix we will use
            span_entry = span[random.randrange(len(span))]
            ret.append(span_entry)
        # ret has N samples in it
        return ret
How can I implement this cleanly in PyTorch?
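
For reference, here is one direction this could take: a minimal sketch, not a definitive implementation. It assumes each dataset row is a dict with a `char2` integer and a `span` matrix, and that all span matrices have the same number of columns. `ToySpanDataset` and `NegativeSamplingCollate` are hypothetical names made up for illustration. The idea is to concatenate every span row into one pool up front and let a custom `collate_fn` draw N random pool rows per data row.

```python
import torch
from torch.utils.data import Dataset, DataLoader


class ToySpanDataset(Dataset):
    """Hypothetical stand-in: each row is a dict with an int 'char2' and an int 'span' matrix."""

    def __init__(self, num_rows=8, span_width=4):
        g = torch.Generator().manual_seed(0)
        self.rows = []
        for i in range(num_rows):
            # Span matrices may have different row counts but share one width (assumption).
            num_span_rows = int(torch.randint(1, 5, (1,), generator=g))
            span = torch.randint(0, 100, (num_span_rows, span_width), generator=g)
            self.rows.append({"char2": i, "span": span})

    def __len__(self):
        return len(self.rows)

    def __getitem__(self, idx):
        return self.rows[idx]


class NegativeSamplingCollate:
    """Collate function that attaches N negative span rows to every data row in a batch."""

    def __init__(self, dataset, num_negatives):
        # Stack every span row in the dataset into one (total_rows, span_width) pool.
        self.pool = torch.cat([dataset[i]["span"] for i in range(len(dataset))], dim=0)
        self.num_negatives = num_negatives

    def __call__(self, batch):
        batch_size = len(batch)
        # One random pool index per (data row, negative sample) pair.
        idx = torch.randint(0, self.pool.size(0), (batch_size, self.num_negatives))
        negatives = self.pool[idx]  # shape: (batch_size, N, span_width)
        char2 = torch.tensor([row["char2"] for row in batch])
        spans = [row["span"] for row in batch]  # kept as a list: row counts differ
        return char2, spans, negatives


dataset = ToySpanDataset()
loader = DataLoader(
    dataset,
    batch_size=4,
    shuffle=True,
    collate_fn=NegativeSamplingCollate(dataset, num_negatives=3),
)
char2, spans, negatives = next(iter(loader))
print(negatives.shape)  # torch.Size([4, 3, 4])
```

Two caveats with this sketch: drawing uniformly from the pooled rows gives longer span matrices more weight than the two-stage sampling in my naive version, and nothing here prevents a negative from coinciding with a row of the positive example's own span.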