Torchtext TabularDataset: adding an annotation column or multiple labels to the datafields

tiancaigg · April 23, 2020, 11:26am

import torch
import torchtext
from nltk.tokenize import word_tokenize  # assumed: NLTK's tokenizer

TEXT = torchtext.data.Field(tokenize=word_tokenize, fix_length=1000)
HYBRIDS = torchtext.data.Field()
LABEL = torchtext.data.LabelField(dtype=torch.float)

datafields = {
    "Phenotype": ("labels", LABEL),
    "sequence": ("text", TEXT),
    "hybrids": ("hybrids", HYBRIDS),
}

In torchtext, I added a hybrids column, which holds the sample names, so that I can tell which sample each prediction belongs to.
But whether I define it as a LabelField or a Field, it doesn’t work and raises an error.
Is there a way to add a name column to the TabularDataset that identifies each sample but is not used as a label or as text for training?

ptrblck · April 24, 2020, 7:04am

What kind of error are you seeing? Could you post the complete error message here, please?
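
Without the error message this is only a guess, but one possible workaround is to avoid passing the name strings through a vocabulary at all: give every row an integer ID, keep an ID-to-name lookup on the side, and map the IDs in each batch back to names after prediction. The sketch below uses only the standard library. The helper `add_sample_ids`, the `sample_id` column, and the CSV values are made up for illustration. In torchtext, the ID column could presumably be declared as `torchtext.data.Field(sequential=False, use_vocab=False)` so that it comes through as plain integers, but that part is an assumption and is not exercised here.

```python
import csv
import io


def add_sample_ids(rows, name_column="hybrids", id_column="sample_id"):
    """Return copies of the rows with an integer ID column, plus an ID -> name lookup.

    Hypothetical helper for illustration; not part of torchtext.
    """
    id_to_name = {}
    rows_with_ids = []
    for idx, row in enumerate(rows):
        id_to_name[idx] = row[name_column]
        new_row = dict(row)
        new_row[id_column] = idx
        rows_with_ids.append(new_row)
    return rows_with_ids, id_to_name


# Tiny in-memory stand-in for the real CSV file (values are invented).
csv_text = "Phenotype,sequence,hybrids\n1.0,ACGT,H1\n0.0,TTGA,H2\n"
rows = list(csv.DictReader(io.StringIO(csv_text)))
rows_with_ids, id_to_name = add_sample_ids(rows)

# After prediction, the integer IDs carried in a batch map back to sample names.
batch_ids = [1, 0]  # pretend these came out of a shuffled batch
names = [id_to_name[i] for i in batch_ids]
```

The rows with the extra ID column would then be written back to a CSV and loaded as before, with the ID column listed in the datafields next to the label and text columns.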