VGGNet and DenseNet training time


I am wondering why training DenseNet takes far longer than training VGGNet,
even though DenseNet has far fewer parameters than VGGNet.
Params for DenseNet: 0.7M
Params for VGGNet: 17M


I know this is an open-ended question, but here are some of my thoughts.

Training time depends on a lot of things:

  1. I hope you are using the same setup for both networks (learning rate, batch size, and so on), because each network has its own recipe for reaching convergence.

  2. Sometimes the network you are training is underfitting the data, so it takes more time to train. In this case that would be DenseNet.

  3. Same as above, but with overfitting in the case of VGGNet. You can test this by checking their accuracies on the training set.

Hi jmandivarapu1,

Thank you for your answer, but
I wasn't asking about convergence or performance.
I am just wondering about the pure computation time for training the two networks!
I am using exactly the same environment for both networks.
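
To make "pure computation time" concrete, here is a minimal sketch of how the time per training step could be measured. The helper `time_training_step` and the stand-in `dummy_step` are hypothetical names used only for illustration. In practice `step_fn` would run one forward pass, backward pass, and optimizer update on a fixed batch.

```python
import statistics
import time


def time_training_step(step_fn, warmup=3, iters=10):
    """Return the median wall-clock seconds taken by one call of step_fn."""
    for _ in range(warmup):
        step_fn()  # warm-up calls are executed but not timed
    samples = []
    for _ in range(iters):
        start = time.perf_counter()
        step_fn()
        samples.append(time.perf_counter() - start)
    return statistics.median(samples)


def dummy_step():
    # Hypothetical stand-in for a real forward + backward + optimizer step.
    sum(i * i for i in range(10_000))


median_seconds = time_training_step(dummy_step)
print(f"median step time: {median_seconds:.6f} s")
```

Timing both networks with the same batch size and input size this way separates per-step cost from the number of steps needed to converge. On a GPU, kernels are launched asynchronously, so the step function would need to wait for the device to finish before the clock is read. Otherwise the measured times can be misleadingly small.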

Then my guess would be that computing the backward graph should take more time for VGG than for DenseNet, if everything else in the environment is the same.
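
One way to sanity-check a guess like this is to count multiply-accumulate operations (MACs) rather than parameters. For a convolution layer, the weights are reused at every output position, so compute scales with the output feature-map size while the parameter count does not. The sketch below uses a hypothetical helper `conv2d_cost` and made-up layer shapes. They are not taken from either network in the question.

```python
def conv2d_cost(c_in, c_out, kernel, h_out, w_out):
    """Return (parameter count, forward MACs) for one square-kernel conv layer."""
    weights_per_filter = c_in * kernel * kernel
    params = c_out * (weights_per_filter + 1)  # +1 for the bias of each filter
    macs = weights_per_filter * c_out * h_out * w_out  # one filter pass per output pixel
    return params, macs


# Illustrative shapes only (assumptions, not measured from VGGNet or DenseNet):
# a wide layer on a small feature map vs. a narrow layer on a large one.
wide_small_map = conv2d_cost(c_in=512, c_out=512, kernel=3, h_out=4, w_out=4)
narrow_large_map = conv2d_cost(c_in=48, c_out=12, kernel=3, h_out=32, w_out=32)

for name, (params, macs) in [("wide, 4x4 map", wide_small_map),
                             ("narrow, 32x32 map", narrow_large_map)]:
    print(f"{name}: params={params}, MACs={macs}, MACs/param={macs / params:.1f}")
```

In this toy comparison the narrow layer has far fewer parameters but does far more work per parameter, so the parameter totals alone do not settle which network is cheaper to compute. Summing the MACs over all layers of each network, together with the measured step times, would give a firmer answer.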