Hello experts,
I have data whose features have different dimensions (a scalar, a vector, and a matrix per row), and I want to build a custom model for it. When I load the data with pandas, it looks like this:
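import numpy as np
import pandas as pd
df = pd.read_csv('data.csv', usecols=['scalar_feature', 'vector_feature', 'matrix_feature', 'target'])
# The vector and matrix cells are read as strings; parse them and convert to NumPy arrays
df['vector_feature'] = df['vector_feature'].apply(eval).apply(np.array)
df['matrix_feature'] = df['matrix_feature'].apply(eval).apply(np.array)
df.head()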
scalar_feature  vector_feature  matrix_feature              target
1.2             [1,3,5]         [[1,3,5],[2,1,1],[2,1,6]]   1
2.1             [2,1,3]         [[3,2,9],[2,2,1],[1,0,3]]   4
1.3             [5,2,1]         [[2,6,5],[2,2,3],[7,0,3]]   2
2.3             [6,1,1]         [[1,5,3],[2,4,3],[4,1,8]]   3
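A quick check along the following lines may help pin down what pandas is actually storing. This is only a sketch: it rebuilds the four rows shown above in memory instead of reading data.csv, and the names feature_block, vectors and matrices are made up for illustration. It suggests that the two array columns have dtype object even though every cell holds a NumPy array, and that np.stack can turn such a column into a single numeric array.

```python
import numpy as np
import pandas as pd

# Rebuild the four sample rows in memory (stand-in for data.csv).
df = pd.DataFrame({
    "scalar_feature": [1.2, 2.1, 1.3, 2.3],
    "vector_feature": [np.array(v) for v in
                       ([1, 3, 5], [2, 1, 3], [5, 2, 1], [6, 1, 1])],
    "matrix_feature": [np.array(m) for m in (
        [[1, 3, 5], [2, 1, 1], [2, 1, 6]],
        [[3, 2, 9], [2, 2, 1], [1, 0, 3]],
        [[2, 6, 5], [2, 2, 3], [7, 0, 3]],
        [[1, 5, 3], [2, 4, 3], [4, 1, 8]],
    )],
    "target": [1, 4, 2, 3],
})

# Columns whose cells are arrays are stored as generic Python objects.
print(df.dtypes)

# Taking several such columns at once also yields an object-dtype array.
feature_block = df[["scalar_feature", "vector_feature", "matrix_feature"]].to_numpy()
print(feature_block.dtype)

# Stacking the per-row arrays gives ordinary numeric arrays.
vectors = np.stack(df["vector_feature"].to_numpy())
matrices = np.stack(df["matrix_feature"].to_numpy())
print(vectors.shape, matrices.shape)
```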
I can successfully create a dataloader from this dataset, but when I try to run my training loop, I get an error:
from torch.utils.data import Dataset, DataLoader

class ClassifierDataset(Dataset):
    def __init__(self, X_data, y_data):
        self.X_data = X_data
        self.y_data = y_data

    def __getitem__(self, index):
        return self.X_data[index], self.y_data[index]

    def __len__(self):
        return len(self.X_data)

train_dataset = ClassifierDataset(X_train, y_train)
train_loader = DataLoader(dataset=train_dataset, batch_size=1)
for batch_idx, (data, target) in enumerate(train_loader):
    etc...
TypeError: default_collate: batch must contain tensors, numpy arrays, numbers, dicts or lists; found object
I have already converted the contents of each cell to NumPy arrays, so I am not sure what else I need to do.
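One possible direction is sketched below. It assumes that X_train was an object-dtype array taken from the DataFrame (that step is not shown above) and that this is what default_collate rejects. The idea is to stack each feature into its own numeric tensor and return the features separately. The class name MultiFeatureDataset and the toy inputs are hypothetical, and the float32/int64 dtypes are a guess at what the model expects.

```python
import numpy as np
import torch
from torch.utils.data import Dataset, DataLoader


class MultiFeatureDataset(Dataset):
    """Hypothetical dataset that keeps each feature as its own numeric tensor."""

    def __init__(self, scalars, vectors, matrices, targets):
        # np.stack merges equally shaped per-row arrays into one numeric array,
        # so no object-dtype data ever reaches the DataLoader.
        self.scalars = torch.as_tensor(np.asarray(scalars, dtype=np.float32))
        self.vectors = torch.as_tensor(np.stack(vectors).astype(np.float32))
        self.matrices = torch.as_tensor(np.stack(matrices).astype(np.float32))
        self.targets = torch.as_tensor(np.asarray(targets, dtype=np.int64))

    def __getitem__(self, index):
        features = (self.scalars[index], self.vectors[index], self.matrices[index])
        return features, self.targets[index]

    def __len__(self):
        return len(self.targets)


# Toy inputs copied from the first two rows of the table.
scalars = [1.2, 2.1]
vectors = [np.array([1, 3, 5]), np.array([2, 1, 3])]
matrices = [np.array([[1, 3, 5], [2, 1, 1], [2, 1, 6]]),
            np.array([[3, 2, 9], [2, 2, 1], [1, 0, 3]])]
targets = [1, 4]

loader = DataLoader(MultiFeatureDataset(scalars, vectors, matrices, targets),
                    batch_size=2)
# The default collate function stacks each feature across the batch.
(scalar_batch, vector_batch, matrix_batch), target_batch = next(iter(loader))
print(scalar_batch.shape, vector_batch.shape, matrix_batch.shape, target_batch.shape)
```

With the real DataFrame, the constructor arguments would presumably be the individual columns, for example df['vector_feature'].to_numpy(), and the model's forward method would need to accept the three feature tensors separately.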
Any advice on how to achieve this?
Thank you very much!