Help with torchtext TabularDataset JSON format

Hi,

I’m currently trying to load JSON data using torchtext’s data.TabularDataset.splits.

My JSON file is the following:

[
{"text": "anything here", "class_label": "positive"},
{"text": "anything here 2", "class_label":"negative"}
]

The following is the fields dictionary I defined (each key is a JSON key, each value is a (name, Field) tuple):

datafield = { "text": ("text", data.Field()),
                     "class_label": ("label", data.Field())
                    }

Yet I’m getting the following error:

ValueError: Specified key **text** was not found in the input data

I was wondering whether I’m missing something or whether my JSON format is incorrect. Thank you for the assistance.

I found the solution through some further experimentation with the format.

The correct JSON file format for the dataset to read in is the following:

{"text": "anything here", "class_label": "positive"}
{"text": "anything here 2", "class_label":"negative"}

Notice that the enclosing [] and the comma between records are gone. Each line is a separate record, i.e. a complete JSON object on its own line.
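
If you already have a file in the array format, a small script can rewrite it one record per line. This is only a minimal sketch using the standard json module; the function name json_array_to_jsonl and the file names in the usage note are made up for illustration and are not part of torchtext.

```python
import json


def json_array_to_jsonl(src_path, dst_path):
    """Rewrite a file holding a single JSON array as one JSON object per line.

    Hypothetical helper for illustration; returns the number of records written.
    """
    with open(src_path, encoding="utf-8") as src:
        records = json.load(src)  # expects a top-level list of objects

    with open(dst_path, "w", encoding="utf-8") as dst:
        for record in records:
            # json.dumps emits no newlines by default, so each record
            # stays on exactly one line.
            dst.write(json.dumps(record, ensure_ascii=False) + "\n")

    return len(records)
```

For example, json_array_to_jsonl("data.json", "data.jsonl") would produce a file that can then be passed to TabularDataset with format="json", assuming the keys in the fields dictionary match the keys in each record.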
