I am new to PyTorch. I have one folder that contains a train.csv file and a train folder. train.csv contains the image names with their corresponding labels, and train contains the images. How do I load the images and then train the model?
You could create a custom `Dataset` as explained in this tutorial. In the `Dataset.__init__` method you could load the corresponding `csv` file and load each sample in `__getitem__` lazily. To do so, you could index the `csv` file (e.g. via a `pd.DataFrame`), load (and transform) the corresponding image and create the target tensor.
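As a rough sketch of that idea: the class name `CsvImageDataset`, the `class_to_idx` argument, and the column order (file name first, label second) are all assumptions for illustration, since the actual layout of `train.csv` isn't shown here. Adapt them to your file.

```python
import os

import pandas as pd
import torch
from PIL import Image
from torch.utils.data import Dataset


class CsvImageDataset(Dataset):  # hypothetical name, not a PyTorch class
    def __init__(self, csv_path, image_dir, class_to_idx, transform=None):
        # Only the small csv file is read up front; images are loaded lazily.
        self.df = pd.read_csv(csv_path)
        self.image_dir = image_dir
        self.class_to_idx = class_to_idx  # e.g. {"manipuri": 0, "odissi": 1}
        self.transform = transform

    def __len__(self):
        return len(self.df)

    def __getitem__(self, idx):
        row = self.df.iloc[idx]
        # Assumption: column 0 holds the file name, column 1 the class name.
        path = os.path.join(self.image_dir, str(row.iloc[0]))
        image = Image.open(path).convert("RGB")
        if self.transform is not None:
            image = self.transform(image)
        # The target is a class index (not one-hot), stored as a LongTensor.
        target = torch.tensor(self.class_to_idx[row.iloc[1]], dtype=torch.long)
        return image, target
```

You would then wrap an instance of it in a `torch.utils.data.DataLoader` (with a transform that ends in `ToTensor()`, so the samples can be batched).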
Let me know, if you get stuck somewhere.
sir my train.csv is in this form
Do I need to convert the target to a one-hot vector?
Also, what is idx in `__getitem__(idx)`?
If you are working on a multi-class classification, the targets should be the class indices and should not be one-hot encoded.
E.g. if your use case has 5 classes, the valid target values would be `[0, 1, 2, 3, 4]`.
For your input data, you could use a mapping, such that e.g. `manipuri` maps to one class index, `odissi` maps to another, and so on.
The `Dataset.__getitem__(self, index)` method is called by the `DataLoader` with an index for each sample in the range `[0, len(dataset) - 1]` and is responsible for loading and returning the sample for the current index.
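To make the label mapping and the index range concrete, here is a small sketch. The label list only uses the two class names mentioned in this thread (your data has more), the alphabetical ordering of the mapping is just one possible choice, and the random `logits` stand in for a model output.

```python
import torch
import torch.nn as nn

# Assumption: string labels as they might appear in the csv label column.
labels = ["manipuri", "odissi", "odissi", "manipuri"]

# Map each class name to an integer index (sorted so the mapping is stable).
class_to_idx = {name: i for i, name in enumerate(sorted(set(labels)))}

# Targets are class indices with dtype long, not one-hot vectors.
targets = torch.tensor([class_to_idx[name] for name in labels], dtype=torch.long)

# nn.CrossEntropyLoss takes logits of shape [N, C] and class indices of shape [N].
logits = torch.randn(len(labels), len(class_to_idx))
loss = nn.CrossEntropyLoss()(logits, targets)

# The indices passed to __getitem__ run from 0 to len(dataset) - 1.
valid_indices = list(range(len(labels)))
```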
Sir, after all the loading I trained the model. The number of images in the train dataset is only 364, so I want to keep the images of the different classes equally represented in my train dataset and validation dataset. How can I do that? I am currently using mobilenet_v2 as the model. Since there are so few images, should I write my own model, or, if I keep this model, what can I do to increase the accuracy?
364 images are not that many and your model might overfit quickly, especially since you would need to split this dataset into a training, validation, and test set.
You could try to use an aggressive data augmentation and observe the validation loss to make sure the model still generalizes well.
I don’t think that a custom model trained from scratch would be easier, so your best bet might be to try to fine-tune a pretrained model, add data augmentation, maybe increase the regularization, or in the best case collect more data.
How can I stratify my image dataset so that my train and validation datasets have the same class proportions? (I have 8 classes, and every class must be present in both train and validation in the same ratio.)
Thanks for your help
You could use `sklearn.model_selection.train_test_split` with the `stratify` argument. This would return indices for the training and validation dataset, which you could then use to build each split.
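A minimal sketch of a stratified split is shown here. It uses synthetic labels and a dummy `TensorDataset` as stand-ins for your data, and wrapping the indices in `torch.utils.data.Subset` is one possible choice. In your case `labels` would come from the label column of the csv file, and `dataset` would be your custom `Dataset`.

```python
import numpy as np
import torch
from sklearn.model_selection import train_test_split
from torch.utils.data import Subset, TensorDataset

# Assumption: synthetic stand-in data with 8 classes and 10 samples per class.
num_classes, per_class = 8, 10
labels = np.repeat(np.arange(num_classes), per_class)
dataset = TensorDataset(torch.randn(len(labels), 3, 8, 8),
                        torch.from_numpy(labels))

# Split the sample indices; stratify keeps the class ratios in both splits.
indices = np.arange(len(labels))
train_idx, val_idx = train_test_split(
    indices, test_size=0.2, stratify=labels, random_state=0
)

# Subset wraps the original dataset and exposes only the given indices.
train_dataset = Subset(dataset, train_idx.tolist())
val_dataset = Subset(dataset, val_idx.tolist())

# Per-class sample counts in each split.
print(np.bincount(labels[train_idx]))
print(np.bincount(labels[val_idx]))
```

If you need different transforms for training and validation, you could create two instances of your dataset (one per transform) and wrap each in a `Subset` with the matching indices.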