Customized dataset from JSON file

Oussama_Bouldjedri · January 19, 2022, 8:28pm

I am working with JSON files. The structure of each file is:
{"data": [{
  "ID": …,
  "SensorName": …,
  "DigitalValue": …,
  "AnalogValue": …,
  "SchematicName": …,
  "Pin": …,
  "Enable": …,
  "TimeStamp": …
}]}

Each file is made of many structures like the one above. I was wondering how to write a customized dataset for PyTorch. Using the code below, I get errors:

class JsonDataset(IterableDataset):
    def __init__(self, files):
        self.files = files
        self.label = label
        print('self files ', self.files)
        input()

    def __len__(self):
        return ()

    def __getitem__(self, idx):
        print('get item hello')
        input()
        data = []
        for k, v in idx['data'][0].items():
            print('v is ', v)
            print()
            print('k is ', k)
            input()
            if type(v) == str:
                # convert it into int or float
                v = int(v)
                # and append into data
                data.append(v)
            else:
                data.append(v)
        torch_data = torch.Tensor(data)
        label = Y
        input()

Thanks.

nivek · January 19, 2022, 9:34pm

Hi, based on your code snippet, it doesn’t seem like your __getitem__ method is returning anything. Maybe try adding return torch_data at the end?

If you include your error message here, it will be helpful as well.
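For the missing return mentioned above, here is a minimal sketch of what a working version might look like. It is not necessarily how your data should be handled: it assumes a map-style Dataset with one JSON file per sample, that only the first record under "data" is used, that every field is a number or a numeric string, and that labels come from a hypothetical `labels` list passed to the constructor.

```python
import json

import torch
from torch.utils.data import Dataset


class JsonDataset(Dataset):
    """Map-style dataset: one JSON file per sample (an assumption)."""

    def __init__(self, files, labels):
        self.files = files    # list of JSON file paths
        self.labels = labels  # hypothetical: one label per file

    def __len__(self):
        # The length must be an integer, here the number of files.
        return len(self.files)

    def __getitem__(self, idx):
        # idx is an integer index, so the file has to be loaded first.
        with open(self.files[idx]) as f:
            record = json.load(f)["data"][0]
        values = []
        for value in record.values():
            # Assumes every field is a number or a numeric string.
            values.append(float(value))
        # Return the sample, otherwise the DataLoader receives None.
        return torch.tensor(values, dtype=torch.float32), self.labels[idx]
```

Fields such as "SensorName" that hold free text would make `float(value)` fail, so they would need to be dropped or encoded separately.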

Oussama_Bouldjedri · January 20, 2022, 1:48am

Thanks. I think my actual issue is that I am not able to iterate over my JSON file, whose structure is:

{"serial_number": "…", "timestamp": …, "data": […]}

Do you have an idea about how to iterate over this specific type of structure?

PS: I have many files, and the structure of this second type is different from the one I posted earlier.

nivek · January 20, 2022, 4:07am

It depends on the JSON file structure and on which level you would like to iterate through.

If the JSON file is a list of entries, and each entry is a dict like this {"serial_number": "…", "timestamp": …, "data": […]}, then:

dictionaries = json.load(f)
for row in dictionaries:
    data = row['data']

If the JSON file is a dict with keys such as "serial_number", "timestamp", "data" (i.e. {"serial_number": "…", "timestamp": …, "data": […]}), then:

d = json.load(f)
serial_number = d['serial_number']
timestamp = d['timestamp']
data = d['data']
for point in data:
    # You can iterate through the data here, or you can just convert it to a Tensor
    pass
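If you are not sure which of the two shapes a given file has, the two cases above can be folded into one helper. This is only a sketch: `iter_data_fields` is a hypothetical name, and it assumes every entry has a "data" key.

```python
import json


def iter_data_fields(text):
    """Yield the "data" list of every entry in a JSON document."""
    parsed = json.loads(text)
    # A top-level dict is treated as a single entry.
    entries = parsed if isinstance(parsed, list) else [parsed]
    for entry in entries:
        yield entry["data"]
```

Each yielded list can then be iterated point by point or converted to a Tensor directly.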

Oussama_Bouldjedri · January 20, 2022, 7:37pm

Both json.load(f) calls give me this error:
raise JSONDecodeError("Extra data", s, end)

JSONDecodeError: Extra data

nivek · January 20, 2022, 7:51pm

It is hard to debug without being able to see the file that you are trying to load, but the answers here should help: Python json.loads shows ValueError: Extra data - Stack Overflow
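For context, "Extra data" is raised when the parser finds more content after the first complete JSON value, for example several objects written back to back in one file. If that turns out to be your case (I can't confirm it without the file), one possible workaround is to decode the values one at a time with `json.JSONDecoder.raw_decode`. `parse_concatenated` below is a hypothetical helper name:

```python
import json


def parse_concatenated(text):
    """Parse several JSON values written back to back in one string."""
    decoder = json.JSONDecoder()
    values = []
    pos = 0
    while pos < len(text):
        # Skip whitespace between values; raw_decode does not accept it.
        if text[pos].isspace():
            pos += 1
            continue
        # raw_decode returns the value and the index where it ended.
        value, pos = decoder.raw_decode(text, pos)
        values.append(value)
    return values
```

If the file is simply malformed (for example, unbalanced braces), this will still raise a JSONDecodeError and the file itself has to be fixed.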

Oussama_Bouldjedri · January 20, 2022, 9:54pm

It seems that my file is not well formatted, somehow related to the {}. Thanks for the help.