I am trying to create a Dataset class that should later work like any other standard PyTorch Dataset. The raw data contains measurements from accelerometers that were attached to a gearbox. Data was collected in 560 different runs while the health of the gearbox degraded between runs. I have already extracted 16 features for each run, which I would like to use as input data for a neural network (deep feedforward). The point I am struggling with is how to generate a dataset from these runs. I followed this tutorial, but got stuck because my data is two-dimensional, with the different runs arranged in rows and the respective features in columns. This is the code I have so far, but I don't know how to access the individual rows.
import os

import pandas as pd
from torch.utils.data import Dataset


class Features(Dataset):
    def __init__(self):
        folderpath = r'...\Balanced_Data'
        data_path = os.path.join(folderpath, 'good_and_bad.csv')
        data = pd.read_csv(data_path, header=None)
        self.samples = data

    def __len__(self):
        return len(self.samples)

    def __getitem__(self, idx):
        return self.samples.at[idx, 0]
When I try to access a single row,
dataset.samples[0]
yields all entries from the first column.
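To make this reproducible without my file, here is a minimal sketch using a small made-up DataFrame in place of good_and_bad.csv. The values, the 3x4 shape and the class name FeaturesSketch are assumptions for illustration only, not my real data. As far as I can tell, with header=None the columns get integer labels, so samples[0] selects the column labelled 0, while samples.iloc[0] selects the first row by position. Returning a whole row as a float tensor from __getitem__ might be one way to get one run per item:

```python
import pandas as pd
import torch
from torch.utils.data import Dataset

# Made-up stand-in for good_and_bad.csv: 3 runs (rows) x 4 features (columns).
# The real data described above has 560 runs with 16 features each.
toy = pd.DataFrame([[0.1, 0.2, 0.3, 0.4],
                    [1.1, 1.2, 1.3, 1.4],
                    [2.1, 2.2, 2.3, 2.4]])

first_column = toy[0]     # label-based: the column labelled 0, one value per run
first_row = toy.iloc[0]   # position-based: the first run, one value per feature


class FeaturesSketch(Dataset):
    """Hypothetical variant of Features that returns one run (row) per index."""

    def __init__(self, frame):
        # Convert once to a float32 tensor of shape (n_runs, n_features).
        self.samples = torch.tensor(frame.to_numpy(), dtype=torch.float32)

    def __len__(self):
        return self.samples.shape[0]

    def __getitem__(self, idx):
        # Indexing the first dimension selects a whole row.
        return self.samples[idx]


dataset = FeaturesSketch(toy)
print(len(first_column), len(first_row))  # 3 4
print(dataset[0].shape)
```

This sketch ignores labels entirely. If the file also contains a label column, it would presumably have to be split off from the features before building the tensor.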
Since I am fairly new to this topic, I would very much appreciate any help and tips on a general approach to this.
Cheers,
Gerrit