What affects the quality/accuracy of image classification results?


For a continuing-education course I analyzed PyTorch for image classification (using the Kaggle Dogs vs. Cats dataset). In particular, I investigated what influences the quality/accuracy of the results.

I claim that the following points are the most important (sorted by importance):

  1. amount of data (I recommend at least 20,000 images for classification with two classes; the Dogs vs. Cats example has 25,000)
  2. data quality
  3. train/test split of the data (best results at 90%/10%)
  4. model used
  5. script modifications (e.g. optimizer, transforms)
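
To make point 3 concrete, here is a minimal sketch of a reproducible 90%/10% split using `torch.utils.data.random_split`. The random tensors are only a stand-in for the real images, and the dataset size, image shape, and seed are assumptions chosen for illustration, not values from my experiments.

```python
import torch
from torch.utils.data import TensorDataset, random_split

# Stand-in dataset (assumption): 1,000 random "images" of shape 3x32x32
# with binary labels. A real run would load the Dogs vs. Cats images instead.
images = torch.randn(1000, 3, 32, 32)
labels = torch.randint(0, 2, (1000,))
dataset = TensorDataset(images, labels)

train_fraction = 0.9  # the 90%/10% split from point 3
n_train = round(len(dataset) * train_fraction)
n_test = len(dataset) - n_train

# A seeded generator makes the split identical between runs, so that
# accuracy differences are not caused by a different train/test partition.
generator = torch.Generator().manual_seed(42)
train_set, test_set = random_split(dataset, [n_train, n_test], generator=generator)

print(len(train_set), len(test_set))
```

Changing `train_fraction` and retraining with the same seed is one way to compare split ratios under otherwise equal conditions.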

The following properties have only a small influence on the accuracy:

  • Size of the images
  • Alignment of the images

Furthermore, I claim that the hardware used (CPU/CUDA) does not influence the quality of the results, but it clearly affects the run time.
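
A small sketch of how the hardware claim could be checked: the same tiny model is run on the CPU and, if one is available, on a CUDA device, and the outputs are compared within a tolerance. The `Linear` layer is only a stand-in for a real classifier, and the tolerance is an assumption; small floating-point differences between devices are expected, so an exact equality check would be too strict.

```python
import torch

# Use CUDA if it is available, otherwise fall back to the CPU.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

torch.manual_seed(0)             # same initial weights and inputs on every run
model = torch.nn.Linear(8, 2)    # tiny stand-in for a real classifier
batch = torch.randn(4, 8)

with torch.no_grad():
    cpu_out = model(batch)                               # reference result on the CPU
    dev_out = model.to(device)(batch.to(device)).cpu()   # same computation on `device`

# Compare within a tolerance (assumed value) instead of bit-for-bit.
same = torch.allclose(cpu_out, dev_out, atol=1e-5)
print(device, same)
```

On a machine without CUDA both results come from the CPU, so the comparison is trivially true; the check only becomes informative when a GPU is present.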

One more question for long-time users: Have new versions of PyTorch significantly improved the quality/accuracy, or is that purely model-related? Unfortunately, I cannot test this myself.

Would you accept this as a professor, or have I forgotten important points?