I am working on a problem where I have multiple CSV files, and I need to read them one by one with a sliding window. For example, if one CSV file has 330 data points and the window size is 32, I should get 10 full windows (10 * 32 = 320 points), and the last 10 points are discarded.
I started writing a dataset for this, but after spending a lot of time on it I still cannot get it working. The current code looks like this:
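To make the arithmetic concrete, here is a small sketch of how the window count and the discarded remainder fall out of integer division. It assumes non-overlapping windows (stride equal to the window size); the variable names are only illustrative.

```python
n_points = 330      # rows in one CSV file (example value from above)
window_size = 32

# Integer division gives the number of complete windows.
n_windows = n_points // window_size
n_used = n_windows * window_size
n_discarded = n_points - n_used

# Start row of each window: 0, 32, 64, ...
starts = list(range(0, n_used, window_size))

print(n_windows, n_used, n_discarded)  # 10 320 10
```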
class CustomDataset(Dataset):
    def __init__(self, data_folder, window_size):
        self.data_folder = data_folder
        self.data_file_list = [file for file in os.listdir(data_folder)]
        print(self.data_file_list)
        self.window_size = window_size

    def __len__(self):
        return len(self.data_file_list[0])

    def __getitem__(self, idx):
        filename = self.data_file_list[idx]
        data, label = read_file(filename)
        return data, label

    def read_file(self, filename):
        data = pd.read_csv(filename)
        data = data.drop(["file_name", "class_name"], axis = 1)
        features = data.drop(["class_no"], axis = 1)
        labels = data["class_no"]
        x = [features[index:index+self.window_size].values for index in range(0, len(features))]
        y = [labels[index:index+self.window_size].values for index in range(0, len(labels))]
        return x, y
Note: I can’t merge all these CSV files into one.
I am getting this error:
TypeError: object of type 'type' has no len()
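A minimal sketch of the intended behavior, under several assumptions: windows do not overlap, every non-dropped column is numeric, `class_no` holds integers, and one dataset item is one window from one file. The class name `WindowedCSVDataset` is hypothetical. It reads files with the standard `csv` module only to stay dependency-free; `pd.read_csv` would fit the same structure. The error message itself usually means `len()` was called on a class rather than an instance (for example, passing `CustomDataset` instead of `CustomDataset(...)` to a `DataLoader`), but that call is not shown here, so this is only a guess.

```python
import bisect
import csv
import os

try:
    from torch.utils.data import Dataset
except ImportError:  # fallback so the sketch also runs without PyTorch
    Dataset = object

DROP_COLS = ("file_name", "class_name")  # column names taken from the code above
LABEL_COL = "class_no"


class WindowedCSVDataset(Dataset):
    """Hypothetical sketch: one item is one non-overlapping window of one CSV."""

    def __init__(self, data_folder, window_size):
        self.window_size = window_size
        # Store full paths; os.listdir returns bare file names only.
        self.paths = sorted(
            os.path.join(data_folder, name)
            for name in os.listdir(data_folder)
            if name.endswith(".csv")
        )
        # cumulative[i] = number of complete windows in files 0..i
        self.cumulative = []
        total = 0
        for path in self.paths:
            with open(path, newline="") as f:
                n_rows = sum(1 for _ in csv.DictReader(f))
            total += n_rows // window_size  # leftover rows are discarded
            self.cumulative.append(total)

    def __len__(self):
        # Total number of windows across all files, not the number of files.
        return self.cumulative[-1] if self.cumulative else 0

    def __getitem__(self, idx):
        if not 0 <= idx < len(self):
            raise IndexError(idx)
        # Find the file that contains global window idx.
        file_idx = bisect.bisect_right(self.cumulative, idx)
        first_window = self.cumulative[file_idx - 1] if file_idx else 0
        start = (idx - first_window) * self.window_size
        # Re-reads the file on every call; simple, but slow for large files.
        with open(self.paths[file_idx], newline="") as f:
            rows = list(csv.DictReader(f))[start:start + self.window_size]
        features = [
            [float(v) for k, v in row.items()
             if k not in DROP_COLS and k != LABEL_COL]
            for row in rows
        ]
        labels = [int(row[LABEL_COL]) for row in rows]
        return features, labels
```

The files are never merged: only the per-file window counts are kept in memory, and each item is loaded from its own file on demand. With a `DataLoader`, the features and labels would likely need to be converted with `torch.tensor(...)` before being returned so that batching produces the expected shapes.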