Efficient layer combination needed

Hi, I’d like to train a NN on a given dataset whose images all contain some kind of object (for example, a dog). After training, the NN should help me classify my images (downloaded from Instagram) as “image contains a dog (with probability 0.XX)” or “image doesn’t contain a dog (with probability 0.XX)”.

Obviously the Instagram images do not all have the same size (though they all have the same format (.jpg) due to filtering), and the images from my dataset don’t have a uniform size either.

How should I handle this? Which NN architecture is recommended?
Should I just center-crop all images to the same size?

While testing I’m using the “Stanford Dogs Dataset”, but the final product should be able to handle different training datasets as well.
The NN does not have to be extremely good; I just want to sort the downloaded images into two groups:

  1. has the given object
  2. doesn’t have the given object

With, for example, a confidence of 0.6.
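The two-group split with a 0.6 confidence cutoff could be sketched like this (a minimal pure-Python sketch; `classify`, the threshold handling, and the choice of index 0 as the positive class are my assumptions, not anything from a specific library):

```python
import math

def softmax(logits):
    # Numerically stable softmax over a list of raw scores.
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def classify(logits, threshold=0.6):
    """Return a label and the probability of the positive class
    (index 0 here, by assumption). Images whose probability falls
    between 1 - threshold and threshold are left 'uncertain'."""
    p_has = softmax(logits)[0]
    if p_has >= threshold:
        return "has object", p_has
    if p_has <= 1 - threshold:
        return "no object", p_has
    return "uncertain", p_has
```

With two output neurons the NN produces two logits per image, and a rule like this turns them into the two groups (plus an optional "uncertain" bucket for low-confidence images).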

Thanks in advance

I think you just need to resize all images to the same size.
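The usual preprocessing is "scale the shorter side to a fixed length, then center-crop a square". Here is a small pure-Python sketch of that geometry (the function name and the 256/224 defaults are just illustrative choices):

```python
def resize_then_crop_box(width, height, short_side=256, crop=224):
    """Compute the geometry of a resize-then-center-crop pipeline:
    scale so the shorter side equals `short_side`, then take a
    centered `crop` x `crop` square. Returns the resized size and
    the crop box as (left, top, right, bottom)."""
    # Scale factor that maps the shorter side onto `short_side`.
    scale = short_side / min(width, height)
    new_w, new_h = round(width * scale), round(height * scale)
    # Centered crop box inside the resized image.
    left = (new_w - crop) // 2
    top = (new_h - crop) // 2
    return (new_w, new_h), (left, top, left + crop, top + crop)
```

If you are using torchvision, `transforms.Resize(256)` followed by `transforms.CenterCrop(224)` does exactly this for you, so you normally wouldn't compute the box by hand.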

Fixed all bugs now.
I just need help with the neural network architecture.
My images come in as tensors of shape input[1, 3, 256, 256]. Now my questions:

What kind of layer combination would work well?
What parameters do I have to use in the first layer?
For my use case the output layer should have 2 neurons, right?

My Code: https://hastebin.com/igekafidos.py

Thanks to every reader.
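To make the three questions concrete: a common starting point for a 3x256x256 input is a stack of Conv(3x3, padding 1) -> ReLU -> MaxPool(2x2) blocks, ending in a Linear layer with 2 output neurons (one per class). The channel widths 16/32/64 below are hypothetical, not from the linked code; this sketch just works out the sizes you would plug into such a network:

```python
def conv_out(size, kernel, stride=1, padding=0):
    # Standard conv/pool output-size formula:
    # floor((size + 2*padding - kernel) / stride) + 1
    return (size + 2 * padding - kernel) // stride + 1

# The first conv layer must have in_channels=3 (RGB input);
# each block keeps the spatial size (3x3 conv, padding 1)
# and then halves it (2x2 max pool, stride 2).
size, channels = 256, 3
for out_channels in (16, 32, 64):  # hypothetical channel widths
    size = conv_out(size, kernel=3, padding=1)  # 3x3 conv: size unchanged
    size = conv_out(size, kernel=2, stride=2)   # 2x2 pool: size halved
    channels = out_channels

# Number of input features for the first fully connected layer.
flat_features = channels * size * size
```

Running this gives 256 -> 128 -> 64 -> 32, so the flattened feature count is 64 * 32 * 32 = 65536. In PyTorch terms that would correspond to a first layer `nn.Conv2d(3, 16, 3, padding=1)` and a head ending in `nn.Linear(65536, 2)`, so yes, 2 output neurons for your two groups.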