How to set up a custom Dataset to work with PyTorch

I am trying to create a Dataset class that should later work like any other standard PyTorch Dataset. The raw data contains measurements from accelerometers that were attached to a gearbox. Data was collected in 560 different runs, with the health of the gearbox degrading between runs. I have already extracted 16 features for each run, which I would like to use as input data for the neural network (a deep feed-forward network). The point I am struggling with is how to generate a dataset from these runs. I followed this tutorial, but got stuck because my data is two-dimensional: the different runs are arranged in rows and the respective features in columns. This is the code I have so far, but I don't know how to access the individual rows.

import os
import pandas as pd
from torch.utils.data import Dataset

class Features(Dataset):
    def __init__(self):
        folderpath = r'...\Balanced_Data'
        data_path = os.path.join(folderpath, 'good_and_bad.csv')
        data = (pd.read_csv(data_path, header = None))
        self.samples = data
    
    def __len__(self):
        return len(self.samples)
    
    def __getitem__(self, idx):
        return self.samples.at[idx,0]

When I try to access a single row,

dataset.samples[0]

yields all entries from the first column.

Since I am fairly new to this topic, I would very much appreciate any help and tips for a general approach to this.

Cheers,
Gerrit

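If you use the pandas .iloc indexer here, you can pass in a row position and get back that row. Note that .iloc is used with square brackets rather than called like a function, e.g. self.samples.iloc[idx]. Plain square brackets on a DataFrame, as in samples[0], select a column by label, which is why you got the first column. The documentation is found here.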
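
In case it helps, here is a minimal sketch of how the class might look with .iloc. A few things in it are my assumptions rather than anything from your post: the constructor takes an already-loaded DataFrame instead of building the file path (in your code you would pass in the result of pd.read_csv(data_path, header=None)), every column is treated as a numeric feature, and the frame is filled with placeholder numbers rather than real gearbox data. The rows are also converted to float32 tensors, which is what a network usually expects.

```python
import numpy as np
import pandas as pd
import torch
from torch.utils.data import Dataset, DataLoader


class Features(Dataset):
    def __init__(self, frame):
        # frame: one row per run, one column per feature
        # (assumption: all columns are numeric features, no label column)
        self.samples = frame

    def __len__(self):
        # number of runs (rows)
        return len(self.samples)

    def __getitem__(self, idx):
        # .iloc selects by row position; plain [] would select a column
        row = self.samples.iloc[idx]
        return torch.tensor(row.to_numpy(dtype=np.float32))


# Placeholder values standing in for the real CSV: 560 runs x 16 features
values = np.arange(560 * 16, dtype=np.float32).reshape(560, 16)
frame = pd.DataFrame(values)

dataset = Features(frame)
loader = DataLoader(dataset, batch_size=32, shuffle=False)
first_batch = next(iter(loader))  # one batch of 32 runs, 16 features each
```

With this, dataset[i] gives you the 16 features of run i as a 1-D tensor, and the DataLoader stacks those into batches for you. If you later add a label column, __getitem__ could return a (features, label) pair instead.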

That worked,
thanks a lot :+1: