[SOLVED] MNIST dataset structure

Hi,

I am learning PyTorch and have tested some neural networks on MNIST data.
Now I would like to try a custom dataset modeled on the MNIST data structure, but I cannot see what that structure is.

train_dataset = MNIST('./data', download=False, transform=img_transform)
results in:
Dataset MNIST
Number of datapoints: 60000
Split: train
Root Location: ./data
Transforms (if any): Compose(
ToTensor()
Normalize(mean=(0.5, 0.5, 0.5), std=(0.5, 0.5, 0.5))
)
Target Transforms (if any): None

and train_loader returns
<torch.utils.data.dataloader.DataLoader at 0x29fe1525860>

Moreover, the root location ./data has ‘raw’ and ‘processed’ folders.
Are both of them MNIST data that gets used?

How can I see the precise data structure of MNIST and create my own dataset that can be used with the same procedure?

Thank you in advance :)

Petro

I have never done the kind of loading you are trying to do…

The thing is, the data loader creates an iterator, so you must iterate over it:

for data_x, data_y in dataloader:
    pass

Take a look at this tutorial on how to create a Dataset class and use it to wrap your data.
It must implement the following methods:

from torch.utils.data import Dataset

class CustomDataset(Dataset):
    def __init__(self):
        """Load your data here."""
        pass

    def __len__(self):
        """Return the number of observations here."""
        pass

    def __getitem__(self, index):
        """Receive an index here and return the corresponding sample."""
        pass

Then you create an instance of the dataset and pass it to your dataloader.
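
To make the skeleton above concrete, here is a minimal sketch, assuming the custom data is already in memory as tensors shaped like MNIST samples (1x28x28 images and integer labels from 0 to 9). The class name `TensorPairDataset` and the random stand-in data are hypothetical and only serve as an illustration; they are not part of torchvision.

```python
import torch
from torch.utils.data import Dataset, DataLoader


class TensorPairDataset(Dataset):
    """Hypothetical dataset wrapping in-memory images and labels."""

    def __init__(self, images, labels):
        # images: float tensor of shape (N, 1, 28, 28)
        # labels: integer tensor of shape (N,)
        assert len(images) == len(labels)
        self.images = images
        self.labels = labels

    def __len__(self):
        # Number of samples in the dataset.
        return len(self.labels)

    def __getitem__(self, index):
        # One (image, label) pair, the same pairing MNIST yields.
        return self.images[index], self.labels[index]


# Random stand-in data (an assumption, for illustration only).
images = torch.rand(100, 1, 28, 28)
labels = torch.randint(0, 10, (100,))

dataset = TensorPairDataset(images, labels)
loader = DataLoader(dataset, batch_size=32, shuffle=True)

# Iterate exactly as with the MNIST loader and record the batch shapes.
batch_shapes = []
for data_x, data_y in loader:
    batch_shapes.append((tuple(data_x.shape), tuple(data_y.shape)))
```

With 100 samples and a batch size of 32, the loader yields three full batches and one final batch of 4 samples. Real data would replace the random tensors, for example images loaded from disk in `__init__` or lazily in `__getitem__`.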

Dear cyberwillis,

Thank you for your kind reply.

Actually, I displayed images from the MNIST data during the iteration and confirmed that
every item in fact has a label and an image.
However, because I cannot grasp how the dataset is built from the binary(?) data,
it is difficult for me to create another dataset in the same format as MNIST.

Your other suggestion is also very helpful.
As a next step, I will read the tutorial and try to understand how to create a Dataset class and load the data with it.

Thank you very much for your help:)

If you would like to create another Dataset with the same structure as MNIST, e.g. as a replacement, have a look at torchvision.datasets.MNIST. There you will see which data is downloaded, unzipped, and how the binary data is processed.
Maybe that helps.

Hi again, sorry I didn’t realize earlier that you were referring to the actual structure of the data itself.

Look for the section titled FILE FORMATS FOR THE MNIST DATABASE on the page below.
It explains how the data is stored inside the files, so you can create your own.

http://yann.lecun.com/exdb/mnist/
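
As a rough illustration of the IDX layout described on that page, here is a minimal sketch using only the Python standard library. It assumes unsigned-byte data (type code 0x08), which is what the MNIST image and label files use. The helper names `write_idx` and `read_idx` and the toy file names are hypothetical, and whether a given torchvision version accepts files written this way is not something this sketch checks.

```python
import os
import struct
import tempfile


def write_idx(path, dims, payload):
    """Write unsigned-byte data in IDX layout: magic, dimension sizes, raw bytes."""
    # Magic number: two zero bytes, 0x08 (unsigned byte), number of dimensions.
    header = struct.pack(">BBBB", 0, 0, 0x08, len(dims))
    # Each dimension size is a 4-byte big-endian unsigned integer.
    header += struct.pack(">" + "I" * len(dims), *dims)
    with open(path, "wb") as f:
        f.write(header + bytes(payload))


def read_idx(path):
    """Return (dims, data) for an unsigned-byte IDX file."""
    with open(path, "rb") as f:
        raw = f.read()
    zero1, zero2, type_code, ndim = struct.unpack(">BBBB", raw[:4])
    if (zero1, zero2, type_code) != (0, 0, 0x08):
        raise ValueError("not an unsigned-byte IDX file")
    dims = struct.unpack(">" + "I" * ndim, raw[4:4 + 4 * ndim])
    data = raw[4 + 4 * ndim:]
    expected = 1
    for d in dims:
        expected *= d
    if len(data) != expected:
        raise ValueError("payload size does not match the header")
    return dims, data


# Toy example: two 2x2 "images" and their two labels (made-up values).
tmp_dir = tempfile.mkdtemp()
image_path = os.path.join(tmp_dir, "toy-images-idx3-ubyte")
label_path = os.path.join(tmp_dir, "toy-labels-idx1-ubyte")
write_idx(image_path, (2, 2, 2), [0, 255, 255, 0, 10, 20, 30, 40])
write_idx(label_path, (2,), [7, 3])

image_dims, pixels = read_idx(image_path)
label_dims, toy_labels = read_idx(label_path)
```

Read as a single big-endian integer, the first four bytes give 2051 for a three-dimensional image file and 2049 for a one-dimensional label file, which matches the magic numbers listed on the page.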

Dear ptrblck,

Thank you for your kind reply.
I am sorry for my late reply.

This is exactly what I have been looking for! I will read through this structure and apply it to my new dataset.
Thank you very much for your help :)

Dear cyberwillis,

Thank you again for your reply!
It’s all right, of course; in fact, I really appreciate your replies.
As you say, this page seems to be fundamental for a better understanding of the MNIST dataset.
I will try it again.

I couldn’t have found this page without your help.
Thank you very much:)

Dear ptrblck,

I am sorry, I may have sent the reply below to myself.

Thank you for your kind reply.
I am sorry for my late reply.

This is exactly what I have been looking for! I will read through this structure and apply it to my new dataset.
Thank you very much for your help :)
