How can I use minibatches in my RNN model?

I'm new to PyTorch. While training my RNN model, I ran into some problems implementing minibatches.
My training data is a Python list like:

input: ['word1', 'word2', 'word3', 'word4']
target: ['tg1', 'tg2', 'tg3', 'tg4']

I learned from the web that I need to implement my own Dataset class and use DataLoader, but I have the following problems:

from torch.utils import data


class MyDataset(data.Dataset):
    def __init__(self, words, labels):
        self.words = words
        self.labels = labels

    def __getitem__(self, index):
        # Return one (input, target) pair; DataLoader groups these into batches.
        word, label = self.words[index], self.labels[index]
        return word, label

    def __len__(self):
        return len(self.words)

  1. Should the words and labels passed to the dataset be of type Variable?
  2. How should I write the training part if I want to use minibatches?

Thanks for your answers.

No, they don't need to be Variables. Tensors, NumPy arrays, ints, floats, etc. are all fine, but strings won't work in this case: you need to convert each word to a number (e.g. word1 -> 1, word2 -> 2).
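
To illustrate the word-to-number conversion, here is a minimal sketch in plain Python. The helper name `build_vocab` and the choice to start ids at 1 (leaving 0 free for padding) are my own assumptions, not anything PyTorch requires:

```python
def build_vocab(tokens, first_id=1):
    """Map each distinct token to an integer id, in order of first appearance.

    first_id=1 is an assumption: it leaves id 0 free to be used for padding.
    """
    vocab = {}
    for tok in tokens:
        if tok not in vocab:
            vocab[tok] = len(vocab) + first_id
    return vocab


# The example data from the question.
words = ["word1", "word2", "word3", "word4"]
labels = ["tg1", "tg2", "tg3", "tg4"]

word2id = build_vocab(words)
label2id = build_vocab(labels)

# Encode the strings as integers; these lists can then be stored in the dataset.
word_ids = [word2id[w] for w in words]
label_ids = [label2id[t] for t in labels]
print(word_ids)  # [1, 2, 3, 4]
```

In a real project you would usually build the vocabulary over the whole training corpus and add an entry for unknown words.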

Actually, this is not really a PyTorch issue. For an NLP problem like this, you need to preprocess your data first: convert tokens to ids (words to integers), pad the sequences to the same length, and so on.
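
As for the training part with minibatches, the following is a rough sketch rather than a definitive recipe. It assumes the data has already been encoded as integer ids with 0 reserved for padding; `SeqDataset`, `pad_collate`, `Tagger`, the toy data, and all hyperparameters are made up for illustration. The padding is done per batch in a `collate_fn` using `torch.nn.utils.rnn.pad_sequence`:

```python
import torch
import torch.nn as nn
from torch.nn.utils.rnn import pad_sequence
from torch.utils.data import DataLoader, Dataset

PAD_ID = 0  # assumption: id 0 is reserved for padding (inputs and labels)


class SeqDataset(Dataset):
    """Holds already-encoded sequences: lists of word ids and label ids."""

    def __init__(self, word_ids, label_ids):
        self.word_ids = word_ids
        self.label_ids = label_ids

    def __getitem__(self, index):
        return (torch.tensor(self.word_ids[index]),
                torch.tensor(self.label_ids[index]))

    def __len__(self):
        return len(self.word_ids)


def pad_collate(batch):
    """Pad every sequence in the batch to the length of the longest one."""
    inputs, targets = zip(*batch)
    inputs = pad_sequence(inputs, batch_first=True, padding_value=PAD_ID)
    targets = pad_sequence(targets, batch_first=True, padding_value=PAD_ID)
    return inputs, targets


class Tagger(nn.Module):
    """Hypothetical model: embedding -> RNN -> per-token classifier."""

    def __init__(self, vocab_size, num_labels, hidden=16):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, hidden, padding_idx=PAD_ID)
        self.rnn = nn.RNN(hidden, hidden, batch_first=True)
        self.out = nn.Linear(hidden, num_labels)

    def forward(self, x):
        h, _ = self.rnn(self.embed(x))  # (batch, seq_len, hidden)
        return self.out(h)              # (batch, seq_len, num_labels)


# Toy data (made up): three sequences of different lengths.
word_ids = [[1, 2, 3, 4], [2, 3], [4, 1, 2]]
label_ids = [[1, 2, 3, 4], [2, 3], [4, 1, 2]]

loader = DataLoader(SeqDataset(word_ids, label_ids), batch_size=2,
                    shuffle=True, collate_fn=pad_collate)
model = Tagger(vocab_size=5, num_labels=5)
loss_fn = nn.CrossEntropyLoss(ignore_index=PAD_ID)  # skip padded positions
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)

for epoch in range(3):
    for inputs, targets in loader:  # each iteration is one minibatch
        optimizer.zero_grad()
        logits = model(inputs)
        # Flatten to (batch * seq_len, num_labels) vs (batch * seq_len,).
        loss = loss_fn(logits.reshape(-1, logits.size(-1)),
                       targets.reshape(-1))
        loss.backward()
        optimizer.step()
```

The model here still runs the RNN over the padded positions; only the loss ignores them. If that matters for your task, `pack_padded_sequence` from the same module is the usual next step.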

This may be helpful:


Thank you. This helps a lot.