Customized dataset from JSON file

I am working with JSON files. The structure of each file is:
{"data": [{
    "ID": …,
    "SensorName": …,
    "DigitalValue": …,
    "AnalogValue": …,
    "SchematicName": …,
    "Pin": …,
    "Enable": …,
    "TimeStamp": …
}]}

Each file is made of many structures like the one above. I was wondering how to write a customized dataset for PyTorch from this. With the code below I get errors:

class JsonDataset(IterableDataset):
    def __init__(self, files):
        self.files = files
        self.label=label
        print('self files ',self.files)
        input()

    def __len__(self):
        return ()

    def __getitem__(self, idx):
        print('get item hello')
        input()
        data = []
        for k, v in idx['data'][0].items():
            print('v is ',v)
            print()
            print('k is ',k)
            input()
            if type(v) == str:
                # convert it into int or float
                v=int(v)
                data.append(v)
                # and append into data
            else:
                data.append(v)
        torch_data = torch.Tensor(data)
        label=Y
        input()

thanks

Hi, based on your code snippet, it doesn’t seem like your __getitem__ method is returning anything. Maybe try adding return torch_data at the end?

If you include your error message here, it will be helpful as well.
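For reference, here is a minimal sketch of a dataset that does return a tensor from __getitem__. It is only an illustration under some assumptions: it subclasses the map-style Dataset rather than IterableDataset (since __len__ and __getitem__ are the map-style methods), the sample values are made up, feature_keys is a hypothetical parameter for picking the numeric fields, and labels are left out because the snippet does not show where they come from.

```python
import json

import torch
from torch.utils.data import Dataset


class JsonDataset(Dataset):
    """Map-style dataset: one sample per entry of the file's "data" list."""

    def __init__(self, records, feature_keys):
        # records: list of dicts taken from the "data" key of a parsed file
        # feature_keys: hypothetical parameter naming the numeric fields to keep
        self.records = records
        self.feature_keys = feature_keys

    def __len__(self):
        # Must return an int, the number of samples
        return len(self.records)

    def __getitem__(self, idx):
        record = self.records[idx]
        # Values may arrive as strings, so convert each one to float
        values = [float(record[key]) for key in self.feature_keys]
        # Returning the tensor is the step missing from the original snippet
        return torch.tensor(values, dtype=torch.float32)


# Made-up sample content; a real file would be read with json.load
raw = json.loads(
    '{"data": [{"ID": 1, "DigitalValue": "0", "AnalogValue": "512"},'
    ' {"ID": 2, "DigitalValue": "1", "AnalogValue": "498"}]}'
)
dataset = JsonDataset(raw["data"], feature_keys=["DigitalValue", "AnalogValue"])
first = dataset[0]
```

Non-numeric fields such as SensorName would need their own encoding before they can go into a tensor, which is why this sketch selects fields explicitly instead of converting every value.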


Thanks. I think my actual issue is that I am not able to iterate over my JSON file, which has this structure:

{"serial_number": "…", "timestamp": …, "data": […]}

Do you have an idea how to iterate over this specific type of structure?

PS: I have many files, and this second structure is different from the one I already posted.

The approach depends on the JSON file's structure and on which level you would like to iterate over.

If the JSON file is a list of entries, and each entry is a dict like this {"serial_number": "…", "timestamp": …, "data": […]}, then:

import json

# f is an already-open file object for the JSON file
dictionaries = json.load(f)
for row in dictionaries:
    data = row['data']  # the "data" list of one entry

If the JSON file is a single dict with keys such as "serial_number", "timestamp", "data" (i.e. {"serial_number": "…", "timestamp": …, "data": […]}), then:

d = json.load(f)
serial_number = d['serial_number']
timestamp = d['timestamp']
data = d['data']
for point in data:
    # You can iterate through the data here, or you can just convert it to a Tensor
    pass
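Since you have many files with different layouts, the two cases above could be folded into one helper that checks the top-level type after parsing. This is just a sketch: extract_data_lists is a hypothetical name, and the sample strings are made up to match the structure you described.

```python
import json


def extract_data_lists(parsed):
    """Return the "data" lists from either layout (hypothetical helper)."""
    if isinstance(parsed, list):
        # Layout 1: a list of entries, each a dict with a "data" key
        return [row["data"] for row in parsed]
    if isinstance(parsed, dict):
        # Layout 2: a single dict with a "data" key
        return [parsed["data"]]
    raise TypeError(f"Unexpected top-level JSON type: {type(parsed).__name__}")


# Made-up examples of both layouts
as_list = json.loads(
    '[{"serial_number": "A1", "timestamp": 1, "data": [1, 2]},'
    ' {"serial_number": "B2", "timestamp": 2, "data": [3]}]'
)
as_dict = json.loads('{"serial_number": "A1", "timestamp": 1, "data": [1, 2]}')

from_list = extract_data_lists(as_list)
from_dict = extract_data_lists(as_dict)
```

Both calls return a list of "data" lists, so the code that builds tensors afterwards does not need to know which layout a file used.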

Both json.load(f) calls give me this error:
raise JSONDecodeError("Extra data", s, end)

JSONDecodeError: Extra data

It is hard to debug without being able to see the file that you are trying to load, but the answers here should help: Python json.loads shows ValueError: Extra data - Stack Overflow
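One common cause of "Extra data" is a file that holds several JSON objects one after another instead of a single top-level value. I cannot tell whether that is the case for your file, but if it is, a sketch like the following reads the objects one at a time with json.JSONDecoder.raw_decode. The helper name iter_concatenated_json and the sample text are made up for illustration.

```python
import json


def iter_concatenated_json(text):
    """Yield each top-level JSON value found back to back in `text`."""
    decoder = json.JSONDecoder()
    pos = 0
    length = len(text)
    while pos < length:
        # raw_decode does not skip leading whitespace, so skip it by hand
        while pos < length and text[pos].isspace():
            pos += 1
        if pos >= length:
            break
        # raw_decode returns the parsed value and the index where it ended
        obj, pos = decoder.raw_decode(text, pos)
        yield obj


# Made-up content: two objects in one string, which json.loads rejects
sample = (
    '{"serial_number": "A1", "timestamp": 1, "data": [1, 2]}\n'
    '{"serial_number": "B2", "timestamp": 2, "data": [3]}'
)

try:
    json.loads(sample)
    error_message = None
except json.JSONDecodeError as err:
    error_message = err.msg

records = list(iter_concatenated_json(sample))
```

If each object sits on its own line, calling json.loads on every line of the file is a simpler alternative.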


It seems that my file is not well formatted, somehow related to the {} braces. Thanks for the help.