Fine-tuning pretrained networks: why is DenseNet slow to converge?

I benchmarked fine-tuning of several pretrained networks: VGG, ResNet, and DenseNet. The problem is that every DenseNet variant converges more slowly than VGG or ResNet under the same hyperparameters. Does DenseNet need a special fine-tuning approach, or is its convergence simply slower than the others?

Training conditions:

  • dataset : dogs and cats (1000 images per class)
  • data augmentation : RandomResizedCrop(224x224), RandomHorizontalFlip, RandomVerticalFlip
  • batch_size : 64
  • optimizer : Adam, default setting
  • criterion : cross entropy
  • learning rate : starts at 0.001, multiplied by 0.5 every 50 epochs
  • total epochs : 200
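To be concrete about the schedule (a plain-Python sketch of the rule above, not the training script itself):

```python
def lr_at_epoch(epoch, base_lr=0.001, gamma=0.5, step=50):
    # lr starts at base_lr and is multiplied by gamma once every `step` epochs
    return base_lr * gamma ** (epoch // step)

for epoch in (0, 49, 50, 100, 150, 199):
    print(epoch, lr_at_epoch(epoch))
# 0 and 49 → 0.001, 50 → 0.0005, 100 → 0.00025, 150 and 199 → 0.000125
```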

Test conditions:

  • dataset : dogs and cats (200 images per class), unseen by the model
  • preprocessing : Resize(224x224)
  • batch_size : 64
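For reference, here is a minimal PyTorch/torchvision sketch of the setup described above. It is an assumption, not my exact script: `densenet121` stands in for whichever DenseNet variants were benchmarked, and the head replacement / normalization details may differ.

```python
import torch.nn as nn
from torch.optim import Adam
from torch.optim.lr_scheduler import StepLR
from torchvision import models, transforms

# Augmentation for training, plain resize for evaluation (as listed above)
train_tf = transforms.Compose([
    transforms.RandomResizedCrop(224),
    transforms.RandomHorizontalFlip(),
    transforms.RandomVerticalFlip(),
    transforms.ToTensor(),
])
test_tf = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
])

# densenet121 is a placeholder; the question covers several DenseNet variants
model = models.densenet121(pretrained=True)
model.classifier = nn.Linear(model.classifier.in_features, 2)  # dogs vs. cats

optimizer = Adam(model.parameters(), lr=0.001)          # Adam, default betas/eps
scheduler = StepLR(optimizer, step_size=50, gamma=0.5)  # halve lr every 50 epochs
criterion = nn.CrossEntropyLoss()
```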