torchtext.vocab build_vocab_from_iterator
Hello, please help. The following call raises an error:
vocab = torchtext.vocab.build_vocab_from_iterator(iterate_corpus(all_sentenses), specials=["<unk>", "<pad>"], special_first=True)
TypeError: build_vocab_from_iterator() got an unexpected keyword argument 'specials'
torch version 1.7.0
torchtext version 0.8.0
python 3.8
from typing import List

import torchtext

tokenizer = torchtext.data.get_tokenizer("spacy", language="ru_core_news_sm")
def iterate_corpus(corpus: List[str]):
    for sentense in corpus:
        yield tokenizer(sentense)
corpus = construct_dataset("paraphrases.xml")
all_sentenses = corpus["text_1"].tolist() + corpus["text_2"].tolist()
vocab = torchtext.vocab.build_vocab_from_iterator(iterate_corpus(all_sentenses),
    specials=["<unk>", "<pad>"],
    special_first=True
)
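
A possible workaround, offered as a sketch rather than a confirmed fix: as far as I can tell, build_vocab_from_iterator in torchtext 0.8.0 does not accept a specials argument (it appears to have been added in a later release), so one option is to count the tokens yourself and place the special tokens first. The helper name build_stoi_with_specials and the toy corpus below are hypothetical, and the code uses only the standard library so it does not depend on any particular torchtext version.

```python
from collections import Counter
from typing import Dict, Iterable, List


def build_stoi_with_specials(
    token_lists: Iterable[List[str]],
    specials: List[str],
    special_first: bool = True,
) -> Dict[str, int]:
    """Hypothetical helper: map tokens to indices with special tokens first or last."""
    counter = Counter()
    for tokens in token_lists:
        counter.update(tokens)
    # Order regular tokens by descending frequency, breaking ties alphabetically.
    ordered = sorted(
        (tok for tok in counter if tok not in specials),
        key=lambda tok: (-counter[tok], tok),
    )
    itos = specials + ordered if special_first else ordered + specials
    return {tok: idx for idx, tok in enumerate(itos)}


# Toy pre-tokenized sentences standing in for iterate_corpus(all_sentenses).
toy_corpus = [["a", "b", "a"], ["b", "a", "c"]]
stoi = build_stoi_with_specials(toy_corpus, specials=["<unk>", "<pad>"])
```

If a torchtext Vocab object is needed on 0.8.0, the same Counter could probably be passed to torchtext.vocab.Vocab together with its specials argument; I have not verified that against this exact setup. Upgrading torchtext (and the matching torch version) may be the simpler route if that is an option.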