Hi ptrblck,
Thanks for responding! How can I check that?
It is defined like this:
# `data` here is torchtext.data (the legacy torchtext API);
# `tokenizer` and `generate_n_grams` are helpers defined elsewhere in my code.
self.text = data.Field(
    tokenize=tokenizer,
    lower=True,
    include_lengths=True,
    preprocessing=generate_n_grams,
)
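The vocab is built on it before saving, roughly like this (train_data is just a stand-in for my actual dataset object):

# build_vocab is what attaches the .vocab attribute to the Field
self.text.build_vocab(train_data)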
After loading it, dir() on the field gives:
['__class__', '__delattr__', '__dict__', '__dir__', '__doc__', '__eq__', '__format__', '__ge__', '__getattribute__', '__getstate__', '__gt__', '__hash__', '__init__', '__init_subclass__', '__le__', '__lt__', '__module__', '__ne__', '__new__', '__reduce__', '__reduce_ex__', '__repr__', '__setattr__', '__setstate__', '__sizeof__', '__str__', '__subclasshook__', '__weakref__', 'batch_first', 'build_vocab', 'dtype', 'dtypes', 'eos_token', 'fix_length', 'ignore', 'include_lengths', 'init_token', 'is_target', 'lower', 'numericalize', 'pad', 'pad_first', 'pad_token', 'postprocessing', 'preprocess', 'preprocessing', 'process', 'sequential', 'stop_words', 'tokenize', 'tokenizer_args', 'truncate_first', 'unk_token', 'use_vocab', 'vocab', 'vocab_cls']
It does have a vocab attribute in that list, which is what makes this so confusing.
If I run something like
self.text.vocab.freqs.most_common(20)
it seems to work fine.
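In case it helps, this is roughly the save/load round trip I'm describing (the file name and the use of dill are placeholders, not my exact code):

import dill
import torch

# Field objects often need dill (or torch.save with pickle_module=dill),
# since plain pickle can fail on custom tokenize/preprocessing callables.
torch.save(self.text, "text_field.pt", pickle_module=dill)

# ... later, after loading it back ...
self.text = torch.load("text_field.pt", pickle_module=dill)
print(hasattr(self.text, "vocab"))            # True if the vocab survived
print(self.text.vocab.freqs.most_common(20))  # the check that seems to work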