Hi everyone
I have the same problem, and I can't solve it. I want to use my own MNIST data set.
I have two .csv files, one for training and one for testing. In each file, the first column holds the labels and columns 2 to 785 hold the image pixels.
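import numpy as np
import pandas as pd
import torch
from torch.utils.data import Dataset, DataLoader
from torchvision import transforms

# read the data
df_train = pd.read_csv("Train_Data_FS_with_Label.csv", header=None)
df_test = pd.read_csv("Test_Data_FS_with_Label.csv", header=None)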
# get the image pixel values and labels
train_labels = df_train.iloc[:, 0]
train_images = df_train.iloc[:, 1:]
test_labels = df_test.iloc[:, 0]
test_images = df_test.iloc[:, 1:]
# define transforms
transform = transforms.Compose([
    transforms.ToPILImage(),
    transforms.RandomCrop(24),
    transforms.ToTensor(),
])
# custom dataset
class MNISTDataset(Dataset):
    def __init__(self, images, labels, transforms):
        self.X = images
        self.y = labels
        self.transforms = transforms

    def __len__(self):
        return len(self.X)

    def __getitem__(self, i):
        data = self.X.iloc[i, :]
        # 784 pixel columns form a 28x28 image; RandomCrop(24) reduces it to 24x24 afterwards
        data = np.asarray(data).astype(np.uint8).reshape(28, 28, 1)
        if self.transforms:
            data = self.transforms(data)
        if self.y is not None:
            # positional lookup, consistent with the .iloc indexing used for self.X
            return (data, self.y.iloc[i])
        else:
            return data

train_data = MNISTDataset(train_images, train_labels, transform)
test_data = MNISTDataset(test_images, test_labels, transform)
# dataloaders
trainloader = DataLoader(train_data, batch_size=1, shuffle=True)
testloader = DataLoader(test_data, batch_size=1, shuffle=True)
I also need to split the test data set into two parts:
test_ds, valid_ds_before = torch.utils.data.random_split(testloader , (9500, 500))
small_shared_dataset = create_shared_dataset(valid_ds_before, 200)
In the "create_shared_dataset" function:
def create_shared_dataset(valid_ds, size):
    data_loader = DataLoader(valid_ds, batch_size=1)
    for idx, (data, target) in enumerate(data_loader):  # the error occurs here
        ...
My code fails at this point with the error: "'DataLoader' object is not subscriptable".
How can I solve this error?
I would be very grateful for any help you can give me.
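
A likely cause, as far as I can tell: `random_split` expects a `Dataset`, but the code above passes it `testloader`, which is a `DataLoader`. The split itself succeeds because only the length is checked, and the error only shows up later when the resulting subset tries to index into the `DataLoader`. Below is a minimal sketch of splitting the dataset first and wrapping the parts in loaders afterwards. It uses a synthetic `TensorDataset` as a stand-in for `test_data` (an assumption made so the snippet runs without the CSV files), and `valid_loader` is just an illustrative name.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset, random_split

# Synthetic stand-in for the 10,000-sample test set (hypothetical data, not the CSV files).
images = torch.zeros(10000, 1, 28, 28)
labels = torch.zeros(10000, dtype=torch.long)
test_data = TensorDataset(images, labels)

# Split the Dataset itself, not a DataLoader; the lengths must sum to len(test_data).
test_ds, valid_ds_before = random_split(test_data, [9500, 500])

# Wrap each subset in its own DataLoader only after splitting.
valid_loader = DataLoader(valid_ds_before, batch_size=1)
data, target = next(iter(valid_loader))
print(len(test_ds), len(valid_ds_before), data.shape)
```

With the code from the post, the equivalent change would presumably be `random_split(test_data, (9500, 500))` instead of `random_split(testloader, (9500, 500))`, assuming the test file really contains 10,000 rows.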