Hi,
I built a CNN based on the tutorial "How We Can Give Our Computers Eyes and Ears", because it solves a similar problem: in my program I try to recognize people from pictures of their ears captured by a camera (I use OpenCV).
I have a first demo version, and I have a few questions about it:
- The tutorial uses this transform:
transforms.Normalize(mean=[0.5, 0.5, 0.5], std=[0.5, 0.5, 0.5])
If I use a ResNet with pretrained weights, do I have to normalize each frame with these values instead?
mean = [0.485, 0.456, 0.406] and std = [0.229, 0.224, 0.225]
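To make this question concrete, here is a minimal sketch of what I understand transforms.Normalize to compute for each channel, namely (x - mean) / std on values already scaled to [0, 1]. The helper normalize_pixel and the example pixel are my own illustration and are not from the tutorial. As far as I know, the second set of values is the ImageNet statistics that the pretrained torchvision models were trained with, so the two settings would feed the network differently scaled inputs.

```python
# Per-channel formula applied by transforms.Normalize: out = (x - mean) / std
def normalize_pixel(pixel, mean, std):
    """Normalize one RGB pixel whose channels are already scaled to [0, 1]."""
    return [(x - m) / s for x, m, s in zip(pixel, mean, std)]


TUTORIAL_MEAN, TUTORIAL_STD = [0.5, 0.5, 0.5], [0.5, 0.5, 0.5]
IMAGENET_MEAN, IMAGENET_STD = [0.485, 0.456, 0.406], [0.229, 0.224, 0.225]

white = [1.0, 1.0, 1.0]  # hypothetical example pixel, not real camera data

# With the tutorial values, [0, 1] is mapped onto [-1, 1]
tutorial_result = normalize_pixel(white, TUTORIAL_MEAN, TUTORIAL_STD)
# With the ImageNet values, the same pixel lands on a different scale
imagenet_result = normalize_pixel(white, IMAGENET_MEAN, IMAGENET_STD)

print(tutorial_result)
print(imagenet_result)
```

If that is right, the same white pixel ends up at 1.0 with the tutorial values but above 2 in every channel with the ImageNet values.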
- This is my prediction function. The output (the score variable) takes values between roughly -6 and 6 (I'm not sure about the exact range):
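import numpy as np

def argmax(prediction):
    # Move the model output to the CPU and convert it to a NumPy array
    prediction = prediction.cpu()
    prediction = prediction.detach().numpy()
    top_1 = np.argmax(prediction, axis=1)  # index of the best class per row
    score = np.amax(prediction)  # largest raw output value
    score = '{:6f}'.format(score)
    prediction = top_1[0]
    result = dog_breeds[prediction]  # dog_breeds is defined elsewhere in my program
    return result, score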
I expected a value between 0 and 1, because in many examples people use if (score > 0.5) to decide whether the prediction is good.
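My guess is that the network's last layer returns raw scores (logits) and that a softmax is needed to turn them into probabilities. Below is a minimal sketch of that idea in plain Python. The softmax helper and the example scores are hypothetical and only illustrate the arithmetic; I am assuming the model has no softmax layer of its own.

```python
import math


def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    shift = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(v - shift) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


raw_scores = [5.8, -2.1, 0.3]  # hypothetical network output for 3 classes
probs = softmax(raw_scores)

# Index of the most likely class and its probability
best = max(range(len(probs)), key=probs.__getitem__)
print(best, probs[best])
```

In PyTorch the equivalent would presumably be torch.softmax(prediction, dim=1), applied before converting the tensor to NumPy, so that a check like score > 0.5 makes sense.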
- The tutorial's author uses transfer learning. After that he doesn't seem to use his own dataset, only the one ResNet was trained on. So if I use a pretrained network, do I not need my own dataset? I can't see the place in the program where he combines his dataset with the ResNet one. I don't understand this.
Thanks for any answers!
Have a nice day