Hi,
I have a basic question about building a vocabulary with torchtext. Here is a toy example:
from torchtext.vocab import build_vocab_from_iterator
tokens = ['a', 'bc', 'def']
voc = build_vocab_from_iterator(tokens)
voc.get_stoi() # {'f': 5, 'c': 2, 'e': 4, 'd': 3, 'b': 1, 'a': 0}
I expected 3 entries in the vocabulary ('a', 'bc', 'def'), but I get 6, one per character.
Is there another way to build a vocabulary from a flat list of tokens, or is this not intended/supported?
Best,
Stephan
torchtext version 0.11.2 on macOS
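
A likely explanation, sketched in plain Python rather than with torchtext itself: build_vocab_from_iterator is documented to take an iterator that yields lists (or iterators) of tokens. When it gets a flat list of strings, each string is itself treated as a sequence of tokens and iterated character by character. The helper count_tokens below is a hypothetical stand-in for that counting step, not torchtext's actual implementation.

```python
from collections import Counter


def count_tokens(iterator):
    """Hypothetical stand-in: treat every yielded item as a sequence of tokens."""
    counter = Counter()
    for tokens in iterator:
        # A str is iterable, so 'def' contributes 'd', 'e', 'f'.
        counter.update(tokens)
    return counter


tokens = ['a', 'bc', 'def']

flat = count_tokens(tokens)      # each string is split into characters
nested = count_tokens([tokens])  # one item that holds the three tokens

print(sorted(flat))    # ['a', 'b', 'c', 'd', 'e', 'f']
print(sorted(nested))  # ['a', 'bc', 'def']
```

If that is what happens here, wrapping the list once, as in build_vocab_from_iterator([tokens]), should presumably give the expected 3 entries. This has not been verified against torchtext 0.11.2.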