How to pretrain a CNN?

Yes, I have created the ground truth: 1 for the grid cell that contains the mouse and 0 for every other cell, so there is exactly one 1 per image.
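To make the label format concrete, here is a minimal sketch of how such a target could be built. The function name `grid_label`, the pixel coordinates, and the 3x3 grid are hypothetical and only for illustration; my real grid size may differ.

```python
def grid_label(x, y, img_w, img_h, rows, cols):
    """Return a flat one-hot list with a 1 in the grid cell containing (x, y).

    (x, y) is the mouse position in pixels; rows/cols is the (assumed) grid size.
    """
    if not (0 <= x < img_w and 0 <= y < img_h):
        raise ValueError("position lies outside the image")
    col = int(x * cols / img_w)   # column index of the cell
    row = int(y * rows / img_h)   # row index of the cell
    label = [0] * (rows * cols)
    label[row * cols + col] = 1   # exactly one 1 per image
    return label


# Hypothetical example: mouse at (150, 50) in a 300x300 image, 3x3 grid.
label = grid_label(150, 50, 300, 300, rows=3, cols=3)
print(label)
```

Since there is exactly one 1 per image, the position of that 1 can also be used directly as a class index.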
I am currently trying a VGG16 pretrained on ImageNet and will do as you said. I will let you know if it worked!
Thanks for your help!

EDIT:
Btw, I know I need to preprocess all images as described in "Preprocess for pretrained VGG_bn model". Should I apply that same preprocessing to my 3000 images, or do I have to calculate the normalization values from my own images manually?
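For reference, the two options I am asking about could look roughly like this. This is only a sketch in NumPy: the mean/std values are the per-channel ImageNet statistics commonly used with torchvision's pretrained models, and the helper names `normalize` and `channel_stats` as well as the `(N, H, W, 3)` uint8 RGB layout are my own assumptions.

```python
import numpy as np

# Per-channel RGB statistics commonly used for ImageNet-pretrained models.
IMAGENET_MEAN = np.array([0.485, 0.456, 0.406], dtype=np.float32)
IMAGENET_STD = np.array([0.229, 0.224, 0.225], dtype=np.float32)


def normalize(images, mean=IMAGENET_MEAN, std=IMAGENET_STD):
    """Scale uint8 RGB images (N, H, W, 3) to [0, 1], normalize per channel,
    and return a float32 array in (N, 3, H, W) layout."""
    x = images.astype(np.float32) / 255.0
    x = (x - mean) / std                    # broadcasts over the channel axis
    return np.transpose(x, (0, 3, 1, 2))    # channels-last -> channels-first


def channel_stats(images):
    """Option 2: compute per-channel mean and std from my own images."""
    x = images.astype(np.float64) / 255.0
    return x.mean(axis=(0, 1, 2)), x.std(axis=(0, 1, 2))


# Hypothetical stand-in for my 3000 images (random data, small for the demo).
rng = np.random.default_rng(0)
images = rng.integers(0, 256, size=(8, 32, 32, 3), dtype=np.uint8)

batch = normalize(images)                   # option 1: reuse ImageNet stats
own_mean, own_std = channel_stats(images)   # option 2: dataset-specific stats
print(batch.shape, own_mean, own_std)
```

As far as I understand, reusing the ImageNet values keeps the inputs in the range the pretrained weights were trained on, while the second option only makes sense if the whole network is adapted to my data.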