Hello,
I have a model that takes the results from three different models, split by picture type, and combines them for a final binary classification. Each image is loaded from a file path stored in a dataframe.
Here is a snapshot:
Is there a way to skip the NaN values, or to fill them with a tensor of zeros of size 224x224?
If the above is possible, would I do that in the model object or in the custom dataset object that pulls the data?
Is there a better way to do this?
I was thinking:
class image_Dataset(Dataset):
    '''
    image class data set
    '''
    def __init__(self, data, transform = None):
        '''
        Args:
        ------------------------------------------------------------
        data = dataframe
        image = column in dataframe with absolute path to the image
        label = column in dataframe that is the target classification variable
        numerical_columns = numerical columns from data
        categorical_columns = categorical columns from data
        policy = ID variable
        '''
        self.image_frame = data
        self.transform = transform

    def __len__(self):
        return len(self.image_frame)

    def __getitem__(self, idx):
        if torch.is_tensor(idx):
            idx = idx.tolist()
        label = self.image_frame.loc[idx, 'target']
        if self.image_frame[self.image_frame[idx, 'Roof'].isna() == True]:
            pic = np.ones(3,224,224)
        else:
            pic = Path(self.image_frame.loc[idx,'Roof'])
        img = Image.open(pic)
        if self.transform:
            image = self.transform(img)
        policy = self.image_frame.loc[idx, 'policy']
        numerical_data = self.image_frame.loc[idx, numerical_columns]
        numerical_data = torch.tensor(numerical_data, dtype = torch.float)
        for category in cat_columns:
            self.image_frame[category] = self.image_frame[category].astype('category')
            self.image_frame[category] = self.image_frame[category].astype('category').cat.codes.values
        categorical_data = self.image_frame.loc[idx, cat_columns]
        categorical_data = torch.tensor(categorical_data, dtype = torch.int64)
        return image, label, policy, categorical_data , numerical_data
But this still throws an error when there is a NaN value.
I also tried using a pass statement when it encountered a NaN value. No luck.
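For reference, the fallback being asked about could look something like the sketch below. It is a minimal illustration, not a tested fix for the full pipeline: the path is checked with pd.isna inside __getitem__, and a zero tensor is returned instead of opening a file. The class name RoofImageDataset, the image_col, label_col, and image_shape parameters, and the resize fallback used when no transform is given are all assumptions made for illustration. Only the 'Roof' and 'target' column names come from the code above, and the policy, numerical, and categorical fields are left out for brevity.

```python
import numpy as np
import pandas as pd
import torch
from PIL import Image
from torch.utils.data import Dataset


class RoofImageDataset(Dataset):
    """Hypothetical dataset that returns a zero tensor when the image path is missing."""

    def __init__(self, data, image_col="Roof", label_col="target",
                 transform=None, image_shape=(3, 224, 224)):
        # Reset the index so positional indices from a DataLoader match .loc labels.
        self.frame = data.reset_index(drop=True)
        self.image_col = image_col
        self.label_col = label_col
        self.transform = transform
        self.image_shape = image_shape  # assumed (channels, height, width)

    def __len__(self):
        return len(self.frame)

    def __getitem__(self, idx):
        if torch.is_tensor(idx):
            idx = idx.tolist()

        path = self.frame.loc[idx, self.image_col]
        label = torch.tensor(self.frame.loc[idx, self.label_col], dtype=torch.float)

        if pd.isna(path):
            # Missing photo: use a placeholder of zeros with the expected shape.
            image = torch.zeros(self.image_shape)
        else:
            img = Image.open(path).convert("RGB")
            if self.transform is not None:
                image = self.transform(img)
            else:
                # Assumed fallback: resize and convert to a CxHxW float tensor in [0, 1].
                height, width = self.image_shape[1:]
                img = img.resize((width, height))
                array = np.asarray(img).copy()
                image = torch.from_numpy(array).permute(2, 0, 1).float() / 255.0

        return image, label
```

Handling the missing value in the dataset rather than in the model keeps every sample the same shape, so the default DataLoader batching still works. If rows without a photo should be skipped entirely, dropping them before building the dataset, for example with data.dropna(subset=['Roof']), is another option.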