Has anyone worked with torchtext's LanguageModelingDataset and BPTTIterator?
I’m trying to read in raw text to train a char-rnn.
My text file has sentences of varying lengths, for example:
The sky is blue today. I feel like going for a walk and ice cream. The day is gloomy, I am staying inside and napping.
I load the three splits with:
trn_data, vld_data, tst_data = datasets.LanguageModelingDataset.splits(TEXT, path="<path to file>", train='training.txt', validation="validation.txt", test='testing.txt')
but I get the error:
TypeError: splits() got multiple values for keyword argument 'path'
and then for the iterator:
train_iter, vld_iter, test_iter = data.BPTTIterator.splits((trn_data, vld_data, tst_data), batch_size=batchsize, device=-1, bptt_len=sequence_length, shuffle=True, repeat=False)
Thanks for any help you can provide, always appreciate it.
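A note on the TypeError above: this kind of message usually means the same parameter was filled twice, once positionally and once by keyword. The sketch below reproduces it in plain Python without torchtext. ToyDataset is a hypothetical stand-in, and it assumes (as the error message suggests, but I have not confirmed against the installed version) that splits() takes path as its first parameter and forwards extra keyword arguments such as text_field to the dataset constructor.

```python
class ToyDataset:
    """Hypothetical stand-in for a dataset class; not torchtext code."""

    def __init__(self, path, text_field):
        self.path = path
        self.text_field = text_field

    @classmethod
    def splits(cls, path=None, train=None, validation=None, test=None, **kwargs):
        # Assumed shape of the real method: `path` comes first, and any
        # extra keyword arguments are forwarded to the constructor.
        names = (train, validation, test)
        return tuple(cls(f"{path}/{name}", **kwargs) for name in names if name is not None)


TEXT = "placeholder for a Field object"

# Passing TEXT positionally binds it to `path`; the explicit path=... keyword
# then tries to bind `path` a second time, which raises the TypeError.
try:
    ToyDataset.splits(TEXT, path="corpus", train="training.txt")
    error_message = None
except TypeError as exc:
    error_message = str(exc)

# Passing everything by keyword avoids the clash.
trn, vld, tst = ToyDataset.splits(
    path="corpus",
    train="training.txt",
    validation="validation.txt",
    test="testing.txt",
    text_field=TEXT,
)
```

If the real splits() has the same shape, passing the field as text_field=TEXT instead of as the first positional argument might be enough to get past the error, but that is a guess until checked against the torchtext source.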