Is an ImageNet pre-trained model really important?

When I train an object detector starting from an ImageNet pre-trained model, the loss drops to 0 within 50 epochs. But if I train it from scratch, the loss still remains high after 100 epochs. Is an ImageNet pre-trained model required for training?

An ImageNet “pre-trained” model is one whose filters have already been “trained” to detect visual features. So, if you use a pre-trained model as the starting point for your detector, it can already detect features, which is why the loss drops rapidly within a short time.
However, if you train a new model from scratch, the filters start from random values and first have to learn to detect features, so the loss stays high for much longer.
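So a pre-trained model is not strictly required; whether to use one depends on the training strategy you choose. Starting from pre-trained weights usually converges faster, while training from scratch typically needs more epochs (and often more data) before the loss comes down.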
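To make the effect of initialization concrete, here is a minimal toy sketch in plain Python. It is only an analogy, not a detector: a 1-D linear model stands in for the network, and the two made-up tasks (`source_task`, `target_task`) stand in for ImageNet and your detection dataset. All names and numbers are assumptions chosen for illustration.

```python
def mse(w, b, data):
    """Mean squared error of the linear model y = w * x + b on data."""
    return sum((w * x + b - y) ** 2 for x, y in data) / len(data)


def train(w, b, data, steps, lr=0.05):
    """Plain gradient descent on a 1-D linear model; returns the final (w, b)."""
    n = len(data)
    for _ in range(steps):
        grad_w = sum(2 * (w * x + b - y) * x for x, y in data) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in data) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b


xs = [i / 10 for i in range(-10, 11)]
# Hypothetical "source" task (stand-in for ImageNet) and a related
# "target" task (stand-in for the detection dataset).
source_task = [(x, 2.0 * x + 1.0) for x in xs]
target_task = [(x, 2.2 * x + 0.9) for x in xs]

# "Pre-training": fit the source task starting from zero weights.
pre_w, pre_b = train(0.0, 0.0, source_task, steps=500)

# Same small training budget on the target task, two different starting points.
few_steps = 5
ft_w, ft_b = train(pre_w, pre_b, target_task, steps=few_steps)  # fine-tune
sc_w, sc_b = train(0.0, 0.0, target_task, steps=few_steps)      # from scratch

loss_finetuned = mse(ft_w, ft_b, target_task)
loss_scratch_short = mse(sc_w, sc_b, target_task)

# From scratch with a much larger budget: it still gets there eventually.
long_w, long_b = train(0.0, 0.0, target_task, steps=2000)
loss_scratch_long = mse(long_w, long_b, target_task)

print("fine-tuned, few steps:   ", loss_finetuned)
print("from scratch, few steps: ", loss_scratch_short)
print("from scratch, many steps:", loss_scratch_long)
```

In this toy setup the fine-tuned model has a much lower loss than the from-scratch model after the same few steps, while the from-scratch model catches up once it is given many more steps. A real detector is far more complex, so treat this only as an intuition for why the starting weights matter.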
Hope this helps.