Why do models for fine-grained image recognition overfit on general image recognition datasets?

Hoodythree · September 12, 2020, 2:34am

Models designed for fine-grained image recognition, such as Bilinear-CNN, tend to
overfit on general image recognition datasets like CIFAR-10. Why does this happen? Is it a matter of model capacity, or something else?
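
To make the capacity part of the question concrete, here is a rough sketch of what I mean by bilinear pooling and how large the resulting classifier gets. It is only an illustration under my own assumptions: a backbone whose last convolutional layer has 512 channels, a single linear classifier on top, and the 10 classes of CIFAR-10. The helper names `bilinear_pool` and `linear_classifier_params` are made up for this post, and the normalization steps usually applied after pooling are left out.

```python
def bilinear_pool(features):
    """Average the outer product of each location's channel vector with itself.

    `features` is a list of spatial locations, each a list of C channel values.
    Returns a flat list of length C * C.
    """
    num_locations = len(features)
    channels = len(features[0])
    pooled = [0.0] * (channels * channels)
    for vec in features:
        for i in range(channels):
            for j in range(channels):
                pooled[i * channels + j] += vec[i] * vec[j] / num_locations
    return pooled


def linear_classifier_params(descriptor_dim, num_classes):
    """Count the weights plus biases of one fully connected layer."""
    return descriptor_dim * num_classes + num_classes


# Toy input with made-up values: 2 spatial locations, 3 channels each.
toy = [[1.0, 2.0, 3.0], [0.0, 1.0, 0.0]]
descriptor = bilinear_pool(toy)
print(len(descriptor))  # 9, i.e. C * C for C = 3

# Assumed setting: 512-channel feature map, 10 classes (CIFAR-10).
channels, num_classes = 512, 10
bilinear_params = linear_classifier_params(channels * channels, num_classes)
avgpool_params = linear_classifier_params(channels, num_classes)
print(bilinear_params, avgpool_params)  # 2621450 5130
```

So under these assumptions the bilinear descriptor has 512 * 512 = 262,144 dimensions, and the final layer alone has roughly 500 times more parameters than it would after plain average pooling. I am not sure whether that by itself explains the overfitting.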
Thanks in advance.