I have been working with the CIFAR-100 dataset built into torchvision. I want to train my model only on images with some specific labels and remove the other training examples. How do I do that?
Are those labels from CIFAR-100 itself, or do you want to work with your own labels and images?
Labels from CIFAR-100 only. I want to select all images with specific labels out of the 100 labels and train on them.
I don't understand. Are there images that aren't from CIFAR?
No, I am training on images from CIFAR-100, but taking only those images that belong to specific labels. For example, if I specify labels 2 and 3, I should be able to train on all CIFAR-100 images belonging to these labels and remove the images with other labels.
You have two options:
- Keep the original model and only consider the outputs for labels 2 and 3.
- Make a new head for your model with two outputs (for your classes 2 and 3) and then train the fully connected layers again. You can use the original pretrained FC weights as an initialization for your new FC layers.
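To make the two options concrete, here is a minimal sketch. `TinyNet` is a hypothetical stand-in for whatever pretrained 100-class model you have, and the attribute name `fc` for its last layer is an assumption; adjust both to your own network.

```python
import torch
import torch.nn as nn

KEEP = [2, 3]  # CIFAR-100 class indices to keep (the example labels from this thread)


class TinyNet(nn.Module):
    """Hypothetical stand-in for a pretrained 100-class CIFAR-100 classifier."""

    def __init__(self, num_classes=100):
        super().__init__()
        self.features = nn.Sequential(
            nn.Flatten(), nn.Linear(3 * 32 * 32, 64), nn.ReLU()
        )
        self.fc = nn.Linear(64, num_classes)  # assumed name of the final layer

    def forward(self, x):
        return self.fc(self.features(x))


model = TinyNet()
model.eval()
images = torch.randn(4, 3, 32, 32)  # dummy batch in CIFAR image shape

# Option 1: keep the 100-way model and look only at the logits of the kept classes.
with torch.no_grad():
    logits = model(images)          # shape (4, 100)
sub_logits = logits[:, KEEP]        # shape (4, 2)
# Map the argmax over the 2 kept columns back to the original class ids.
pred = torch.tensor(KEEP)[sub_logits.argmax(dim=1)]

# Option 2: replace the head with a 2-output layer, initialised from the
# rows of the old FC layer that correspond to the kept classes.
old_fc = model.fc
new_fc = nn.Linear(old_fc.in_features, len(KEEP))
with torch.no_grad():
    new_fc.weight.copy_(old_fc.weight[KEEP])
    new_fc.bias.copy_(old_fc.bias[KEEP])
model.fc = new_fc

with torch.no_grad():
    new_logits = model(images)      # shape (4, 2)
```

With option 2 the targets used for training must be remapped to 0 and 1, since the new head only has two outputs.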
Can you please explain how to consider only the outputs for specific labels? I am taking a mini-batch from the dataset using a dataloader. Should I then check the label of each image in the training loop? If yes, how do I exclude the unwanted images from training (how do I remove them), and what labels should I assign to the other images that are not desired?
The images that you are not training your model on should be removed from the training dataset before starting the training process (as it is just wasted computation).
Can you please share some code showing how to remove images with specific labels from the CIFAR-100 dataset? I am a beginner in PyTorch.
In which format is your dataset?
I have taken the CIFAR-100 dataset from PyTorch's torchvision.datasets and then made a dataloader to get images and labels. I don't actually know the exact format of the dataset. I am attaching this link, which can give you a better idea about the dataset.
Here is an alternative. Use fastai_datasets to get your CIFAR-100 dataset. The reason is that it provides the data in ImageNet-style form, i.e. the images of every class/label are in their own separate folder, and you can just delete the folders/classes that you do not want.
To create your dataset you would use ImageFolder, and the rest is the same.
I think PyTorch provides the CIFAR dataset in some batched or pickled form (I have not used it, so I am not fully sure). If you want to use this, it will involve more work. The basic idea is that you would iterate over the complete dataloader and use if conditions to select only those tensors that belong to your classes. You can then store these tensors in another tensor and save them to disk for future use.
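Here is a minimal sketch of that selection idea. Instead of looping over the dataloader, it filters on the dataset's `targets` list (torchvision's CIFAR100 exposes one) and only touches the images that are kept. The names `indices_for_labels`, `RemappedSubset` and `load_filtered_cifar100` are hypothetical helpers, not part of PyTorch.

```python
from torch.utils.data import Dataset


def indices_for_labels(targets, keep):
    """Return the positions in `targets` whose label is in `keep`."""
    keep = set(keep)
    return [i for i, t in enumerate(targets) if int(t) in keep]


class RemappedSubset(Dataset):
    """View of `base` restricted to the `keep` labels, relabelled to 0..K-1.

    Assumes `base` has a `targets` sequence and returns (image, label) pairs.
    """

    def __init__(self, base, keep):
        self.base = base
        # e.g. keep=(2, 3) gives {2: 0, 3: 1}, so a 2-output head can be trained
        self.label_map = {old: new for new, old in enumerate(sorted(keep))}
        self.indices = indices_for_labels(base.targets, keep)

    def __len__(self):
        return len(self.indices)

    def __getitem__(self, i):
        image, label = self.base[self.indices[i]]
        return image, self.label_map[int(label)]


def load_filtered_cifar100(root="./data", keep=(2, 3)):
    """Download CIFAR-100 (training split) and keep only the `keep` labels."""
    # Imported here so the helpers above can be used without torchvision.
    from torchvision import datasets, transforms

    base = datasets.CIFAR100(
        root=root, train=True, download=True, transform=transforms.ToTensor()
    )
    return RemappedSubset(base, keep)
```

You can then pass the result to a dataloader as usual, e.g. `DataLoader(load_filtered_cifar100(keep=(2, 3)), batch_size=64, shuffle=True)`. If you want to keep the original label values (2 and 3) rather than 0 and 1, `torch.utils.data.Subset(base, indices)` with the same indices is enough.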