Currently I am doing the following:

def tokenize_text(text):
    """Tokenize *text* at the character level.

    Returns a list with one single-character token per character of
    *text* (empty list for an empty string), matching the list-of-tokens
    contract torchtext's ``Field(tokenize=...)`` expects.
    """
    # list(str) iterates the string's characters directly; clearer and
    # faster than the identity comprehension [letter for letter in text].
    return list(text)

def tokenize_ints(int_lst):
    """Tokenize an index-column value into a one-element list of ints.

    torchtext's ``Field.tokenize`` contract requires an iterable of
    tokens; the previous version returned a bare ``int``, which breaks
    downstream padding/numericalization. The fix wraps the value in a
    list while keeping the same first-element semantics.

    NOTE(review): torchtext passes CSV cell values as *strings*, so for
    an input like "42" this takes only the first character ('4' -> 4),
    exactly as the original code did — confirm that is the intent.
    """
    return [int(int_lst[0])]

# Character-level field shared by source and target text columns;
# appends an '<eos>' token to every example.
text_Field = Field(tokenize=tokenize_text, eos_token='<eos>')

# Field for the numeric index column. NOTE: it is not referenced in
# train_Datafields below (the "Index" column maps to None), so it is
# currently unused.
index_Field = Field(tokenize=tokenize_ints)

# Column-to-Field mapping for the CSV, in column order; a None field
# makes torchtext skip that column entirely.
train_Datafields = [
    ("Index", None),
    ("SRC", text_Field),
    ("TRG", text_Field),
]

# Load the training examples from the CSV file, skipping its header row.
train_data = TabularDataset(
    path="train.csv",
    format="csv",
    skip_header=True,
    fields=train_Datafields,
)


How do I use that vocab to transform the dataset into one-hot encoded vectors so that I can pass them into a network?