Training specific examples from CIFAR 100

Rohit_G · May 21, 2020, 3:36pm

I have been working with the CIFAR 100 dataset built into torchvision. I want to train my model only on images with some specific labels and remove the other training examples. How do I do that?

jasg · May 21, 2020, 3:45pm

Are those labels from CIFAR 100 itself, or do you want to work with your own labels and images?

Rohit_G · May 21, 2020, 4:00pm

Labels from CIFAR 100 only. I want to select all images with specific labels out of the 100 labels and train on them.

jasg · May 21, 2020, 4:05pm

I don't understand. Are these images that aren't from CIFAR?

Rohit_G · May 21, 2020, 4:23pm

I am training on images from CIFAR 100, but taking only those images that belong to specific labels. For example, if I specify labels 2 and 3, I should be able to train on all CIFAR 100 images belonging to these labels and remove the images with other labels.

Kushaj · May 21, 2020, 6:11pm

You have two options:

Keep the original model and only consider the outputs for labels 2 and 3.
Make a new head of your model with two outputs (for your classes 2 and 3) and then train the fully connected layers again. You can use the original pretrained FC weights as an initialization for your new FC layers.
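A minimal sketch of the second option, assuming the model's last layer is an `nn.Linear` with 100 outputs (for example the `fc` attribute of a torchvision ResNet; your model may name it differently). The helper names `remap_labels` and `shrink_head` are made up for illustration. The new layer is initialized from the rows of the old layer that correspond to the kept classes.

```python
def remap_labels(keep):
    """Map original class ids (e.g. 2 and 3) to contiguous ids 0..K-1."""
    return {orig: new for new, orig in enumerate(sorted(keep))}


def shrink_head(old_fc, keep):
    """Build a smaller nn.Linear that reuses the old weights of the kept classes.

    Assumption: old_fc is an nn.Linear whose output rows are indexed by class id.
    """
    import torch
    from torch import nn

    rows = sorted(keep)
    has_bias = old_fc.bias is not None
    new_fc = nn.Linear(old_fc.in_features, len(rows), bias=has_bias)
    with torch.no_grad():
        # Copy only the weight rows (and biases) of the selected classes.
        new_fc.weight.copy_(old_fc.weight[rows])
        if has_bias:
            new_fc.bias.copy_(old_fc.bias[rows])
    return new_fc


# Hypothetical usage with a ResNet-style model:
# model.fc = shrink_head(model.fc, keep=[2, 3])
# label_map = remap_labels([2, 3])   # original label 2 -> 0, label 3 -> 1
```

With a two-output head the targets have to be 0 and 1 rather than 2 and 3, which is what the mapping is for. For the first option, indexing the logits with the same list of ids, e.g. `logits[:, [2, 3]]`, should be enough.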

Rohit_G · May 21, 2020, 6:24pm

Can you please explain how to consider only the outputs for specific labels? I am taking a mini-batch from the dataset using a dataloader. In the training loop, should I check the label of each image? If yes, how do I exclude the unwanted images from training (how do I remove them), and what labels should I give to the other images that are not desired?

Kushaj · May 21, 2020, 6:26pm

The images that you are not training your model on should be removed from the training dataset before starting the training process (loading them and then skipping them is just wasted computation).

Rohit_G · May 21, 2020, 6:33pm

Can you please explain or share some code showing how to remove images with specific labels from the CIFAR 100 dataset? I am a beginner in PyTorch.

Kushaj · May 21, 2020, 6:39pm

In which format is your dataset?

Rohit_G · May 21, 2020, 6:54pm

I have taken the CIFAR 100 dataset from torchvision.datasets and then made a dataloader to get images and labels. I don't actually know the exact format of the dataset. I am attaching this link, which may give you a better idea of the dataset.
https://pytorch.org/docs/stable/_modules/torchvision/datasets/cifar.html#CIFAR10

Kushaj · May 21, 2020, 10:09pm

Here is an alternative. Use fastai_datasets to get your CIFAR100 dataset. The reason is that it provides the data in ImageNet form, i.e. the images of each class/label are in their own folder, and you can just delete the folders/classes that you do not want.

To create your dataset you would use ImageFolder, and the rest is the same.
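A rough sketch of the folder approach, assuming the extracted training directory contains one subfolder per class. The path and the class names in the usage comment are placeholders, and the helper names are made up for illustration. Note that this deletes folders on disk, so work on a copy of the data.

```python
import shutil
from pathlib import Path


def prune_class_folders(train_dir, keep):
    """Delete every class subfolder of train_dir whose name is not in keep.

    Returns the names of the removed folders. This is destructive.
    """
    keep = set(keep)
    removed = []
    for child in sorted(Path(train_dir).iterdir()):
        if child.is_dir() and child.name not in keep:
            shutil.rmtree(child)
            removed.append(child.name)
    return removed


def make_folder_dataset(train_dir):
    """Wrap the remaining class folders in a torchvision ImageFolder."""
    from torchvision import datasets, transforms

    return datasets.ImageFolder(train_dir, transform=transforms.ToTensor())


# Hypothetical usage (path and class names are placeholders):
# prune_class_folders("cifar100/train", keep={"apple", "bear"})
# train_ds = make_folder_dataset("cifar100/train")
```

ImageFolder assigns class indices from the sorted folder names, so with two folders left the labels come out as 0 and 1, which fits a two-output head.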

I think PyTorch provides the CIFAR dataset in some batched or pickled form (I have not used it, so I am not fully sure). If you want to use this, it will involve more work. The basic idea is that you would iterate over the complete dataloader and use if conditions to select only those tensors that belong to your classes. You can then store these tensors in another tensor and save them to disk for future use.
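For the torchvision version, iterating over the dataloader may not be needed. Judging from the source linked above, the CIFAR100 object keeps its labels in a `targets` list, so the wanted indices can be picked directly and wrapped in `torch.utils.data.Subset`. A sketch under that assumption (the helper names, `root` path, and batch size are arbitrary choices for illustration):

```python
def indices_with_labels(targets, keep):
    """Return the positions in targets whose label is in keep."""
    keep = set(keep)
    return [i for i, t in enumerate(targets) if int(t) in keep]


def make_filtered_loader(root="./data", keep=(2, 3), batch_size=64):
    """Build a DataLoader over only the CIFAR-100 training images with labels in keep."""
    from torch.utils.data import DataLoader, Subset
    from torchvision import datasets, transforms

    full = datasets.CIFAR100(
        root=root, train=True, download=True, transform=transforms.ToTensor()
    )
    # Subset only stores indices; no images are copied or rewritten to disk.
    subset = Subset(full, indices_with_labels(full.targets, keep))
    return DataLoader(subset, batch_size=batch_size, shuffle=True)


# Hypothetical usage:
# loader = make_filtered_loader(keep=(2, 3))
# for images, labels in loader:
#     ...  # labels here are still the original ids 2 and 3
```

The labels coming out of this loader are still the original CIFAR 100 ids. If the model has only two outputs, they need to be mapped to 0 and 1 first, for example through the dataset's `target_transform` argument.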