I am new to both ResNet and PyTorch, so please help me.
I am working on predicting sign language gestures. I have a set of images and their labels as training data, and I am using a ResNet18 model. I load the data through a DataLoader with a batch_size of 10 and feed it into the model for training. During testing, a DataLoader is also used to load the data, with shuffle set to True.
When all 10 images in a batch belong to the same class, the predictions are wrong. However, when the batch of 10 contains images from multiple classes, the predictions are correct and the accuracy is around 90%.
I don’t understand the reason for this. Please help me.
The code is as follows:
import sys
import torch
import torchvision
from torchvision import models

# transform, criterion, optimizer, exp_lr_scheduler, train_cnn and test
# are defined elsewhere in my script (not shown here).

# Training
trainset = torchvision.datasets.ImageFolder(root='Training_DataSet-4', transform=transform)
trainloader = torch.utils.data.DataLoader(trainset, batch_size=10, shuffle=True, num_workers=2)
validation_set = torchvision.datasets.ImageFolder(root='Validation_DataSet-4', transform=transform)
validation_loader = torch.utils.data.DataLoader(validation_set, batch_size=10, shuffle=True, num_workers=2)
net = models.resnet18(pretrained=True)
net = train_cnn(net, trainloader, validation_loader, criterion, optimizer, exp_lr_scheduler, num_epochs=15)
torch.save(net, 'Trained_Model-4.pt')

# Testing
testset = torchvision.datasets.ImageFolder(root=sys.argv[1], transform=transform)
testloader = torch.utils.data.DataLoader(testset, batch_size=10, shuffle=True, num_workers=2)
net = torch.load('Trained_Model-unshuffled-4.pt')
test(net, testloader)
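
One thing that may be worth checking, offered as an assumption because train_cnn and test are not shown above: ResNet18 contains batch-normalization layers, and in training mode these normalize each batch with that batch's own mean and variance. If the model is not switched to evaluation mode (net.eval()) before testing, the output for an image can depend on the other images in its batch, which might explain why single-class batches behave differently from mixed ones. The plain-Python sketch below is not the PyTorch implementation. It uses made-up activation values and hypothetical helper names (batch_norm_train, batch_norm_eval) only to illustrate the effect on a single feature.

```python
import math


def batch_norm_train(batch, eps=1e-5):
    """Normalize with the statistics of the current batch (training-mode behaviour)."""
    mean = sum(batch) / len(batch)
    var = sum((x - mean) ** 2 for x in batch) / len(batch)
    return [(x - mean) / math.sqrt(var + eps) for x in batch]


def batch_norm_eval(batch, running_mean, running_var, eps=1e-5):
    """Normalize with fixed, previously accumulated statistics (eval-mode behaviour)."""
    return [(x - running_mean) / math.sqrt(running_var + eps) for x in batch]


# Hypothetical activations of one feature; the first value is the same image in both batches.
same_class_batch = [5.0, 5.1, 4.9, 5.2]   # all images from one class: values cluster together
mixed_batch = [5.0, 0.5, -3.0, 9.0]       # images from several classes: values spread out

# Made-up running statistics standing in for what training would have accumulated.
RUNNING_MEAN, RUNNING_VAR = 2.0, 16.0

# Training-mode normalization: the same input maps to different outputs per batch.
train_same = batch_norm_train(same_class_batch)[0]
train_mixed = batch_norm_train(mixed_batch)[0]

# Eval-mode normalization: the same input maps to the same output in any batch.
eval_same = batch_norm_eval(same_class_batch, RUNNING_MEAN, RUNNING_VAR)[0]
eval_mixed = batch_norm_eval(mixed_batch, RUNNING_MEAN, RUNNING_VAR)[0]

print("train mode:", train_same, train_mixed)
print("eval mode: ", eval_same, eval_mixed)
```

If this is the cause, calling net.eval() before test(net, testloader) and wrapping the test loop in torch.no_grad() would be the usual thing to try. Whether it applies here depends on what test does internally.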