Feedback on converting a 2D array into a 3D array of images for CNN training

antz · April 25, 2022, 3:59am

Hi,

I am new to PyTorch and to image classification/training, along with all the array manipulation involved, so please bear with me.
I am trying to train Fashion MNIST with a machine learning model/library from scikit-learn and the same dataset with a CNN model. It is an exercise for me to understand what is happening and to gain some practice.
I am obtaining the data using fetch_openml(name="Fashion-MNIST"):

>data1 = fetch_openml(name ="Fashion-MNIST")
>data1.target.shape  ##the classes , "y"
(70000,)
>np.shape(data1.data)
(70000, 784)

It turns out the data above is a pandas DataFrame. I converted it to NumPy with:

Xdata = data1.data.to_numpy()
Ydata = data1.target.to_numpy()

Split data:

x_train, x_test, y_train, y_test = train_test_split(Xdata, Ydata, test_size=0.2, train_size=0.8, random_state=2)

Custom Dataset for Dataloader:

class CData(Dataset):

    def __init__(self,x_data,y_data):

        self.x_data, self.y_data = torch.from_numpy(x_data), torch.from_numpy(y_data)

    def __len__(self):
        return len(self.x_data)

    def __getitem__(self, i):
        return self.x_data[i], self.y_data[i]

Data Loader:

trainloader = DataLoader(CData(x_train,y_train),batch_size=32)
testloader  = DataLoader(CData(x_test,y_test))

Excerpt of training loop:

for epoch in range(epochs):
    for images, labels in trainloader:
        images, labels = images.to(device), labels.to(device)
        images = images.reshape([32,1,28,28])
        # print(type(images))      
        # wrap input images in a Variable wrapper
        images = Variable(images)
        optimizer.zero_grad()
        outputs = net(images.float())
        # Calculate the loss
        loss = F.cross_entropy(outputs,labels.long())
        # Calculate gradient w.r.t the loss
        loss.backward()
        # Optimizer takes one step
        optimizer.step()
        # get the predicted class from the maximum value in the output-list of class scores
        pred = outputs.argmax(dim=1, keepdim=True)
        correct = pred.eq(labels.view_as(pred)).sum().item()
        train_acc =  correct/batch_size

    # calculate the accuracy
    #scheduler.step()
    print(train_acc)

My Questions:
My results seem a little weird, so I want to know:

1. How do I debug/know that I have the images loaded properly into the Tensor Dataset?
2. Is there a more efficient/better way to do the data loading/reshaping?
3. Is there an inbuilt CNN model I can use for my dataset to see if there is an issue with my dataset or with my model? If there is, how would I go about using it?

Miscellaneous:
I am converting from NumPy/scikit-learn to a PyTorch Dataset because I started with the scikit-learn model.

antz · May 3, 2022, 3:42pm

Bump, trying to get some feedback on my workflow above.

eqy · May 4, 2022, 3:46am

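I would start with some sanity checks to make sure that everything is as expected, e.g.,
assert len(self.x_data) == 70000
assert len(self.y_data) == 70000
(70000 is the size of the full dataset; after your 80/20 split, the training set should hold 56000 samples and the test set 14000, so adjust the expected length to whichever array you pass in.)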
I would then check that the initial loss is roughly -ln(1/num_classes), which for the 10 Fashion-MNIST classes is ln(10) ≈ 2.3026.
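To make the initial-loss check concrete, here is a minimal framework-free sketch. The helper names expected_initial_loss and cross_entropy are made up for illustration and are not PyTorch functions; in your loop you would compare the first-batch value of F.cross_entropy(outputs, labels.long()) against the same number.

```python
import math

def expected_initial_loss(num_classes):
    # Loss of a classifier that assigns equal probability to every class
    return -math.log(1.0 / num_classes)

def cross_entropy(logits, label):
    # Cross-entropy of one sample from raw scores, using a stable log-sum-exp
    m = max(logits)
    log_z = m + math.log(sum(math.exp(v - m) for v in logits))
    return log_z - logits[label]

# An untrained network tends to produce near-uniform scores,
# so its loss should sit close to the expected value.
uniform_logits = [0.0] * 10
loss = cross_entropy(uniform_logits, label=3)
print(loss, expected_initial_loss(10))
```

If the first-batch loss is far from this value (several times larger, say), that usually points at the inputs or labels rather than at the optimizer.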

As for checking that a batch of images looks as expected, you can convert the tensor back to a NumPy array and save an image from there (see "Converting tensors to images").
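As a rough sketch of that inspection step, the snippet below reshapes one flattened 784-value row to 28x28 and writes it as a PGM file, which most image viewers can open. It assumes pixel values in the 0 to 255 range; save_pgm is a hypothetical helper and the row is stand-in data. For a real sample you would pass something like images[0].cpu().numpy().

```python
import os
import tempfile

import numpy as np

def save_pgm(flat_pixels, path, side=28):
    # Reshape a flattened grayscale image and write it as a binary PGM file
    img = np.asarray(flat_pixels, dtype=np.float64).reshape(side, side)
    img = np.clip(img, 0, 255).astype(np.uint8)
    with open(path, "wb") as f:
        f.write(f"P5\n{side} {side}\n255\n".encode("ascii"))
        f.write(img.tobytes())
    return img

# Stand-in row (assumption): replace with one row of x_train or one batch element
row = np.arange(784) % 256
path = os.path.join(tempfile.gettempdir(), "fashion_sample.pgm")
img = save_pgm(row, path)
print(img.shape, img.min(), img.max())
```

If the saved picture looks like noise or a sheared garment, the reshape order or the data type is the first thing I would look at.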

From there, I would check whether the training accuracy reaches 100% when the training set is very small and reused as the test set.
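A minimal sketch of that overfitting check, under some assumptions: the linear model is a throwaway stand-in for your net, the random tensors stand in for a small slice of your training data, and overfit_small_subset is a hypothetical helper name.

```python
import torch
from torch import nn

def overfit_small_subset(x, y, num_classes=10, steps=200, lr=1e-2):
    # Train on a tiny subset, then measure accuracy on that same subset
    torch.manual_seed(0)
    model = nn.Sequential(nn.Flatten(), nn.Linear(x[0].numel(), num_classes))
    optimizer = torch.optim.Adam(model.parameters(), lr=lr)
    for _ in range(steps):
        optimizer.zero_grad()
        loss = nn.functional.cross_entropy(model(x), y)
        loss.backward()
        optimizer.step()
    with torch.no_grad():
        pred = model(x).argmax(dim=1)
    return (pred == y).float().mean().item()

# Stand-in data (assumption): swap in e.g. the first 64 training images and labels
x = torch.rand(64, 1, 28, 28)
y = torch.randint(0, 10, (64,))
acc = overfit_small_subset(x, y)
print(f"accuracy on the reused subset: {acc:.2%}")
```

With your own model in place of the stand-in, an accuracy that stays well below 100% on such a small reused subset suggests a problem in the data pipeline or the training loop rather than a lack of model capacity.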