You will probably get a much better answer (and understanding) by watching a few videos on CNN basics (one here: Convolution Ne).
The model you’ve mentioned here seems incomplete for any deep learning task. Note that your input and output layers are largely defined by what you want to achieve: the input layer has to match the shape of your data, and the output layer has to match the kind of prediction you need. It may be more efficient to work through some of the many CNN tutorials and beginner resources available online first!
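To make the point about input and output layers a bit more concrete, here is a rough sketch in plain Python. The helper `io_layer_spec`, the task names, and the default image shape and class count are all made up for illustration and are not taken from your model; it only shows how the task tends to fix the first and last layers.

```python
# Hypothetical helper (not from the original question): it only illustrates
# how the task determines the first and last layers of a network.

def io_layer_spec(task, image_shape=(28, 28, 1), num_classes=10):
    """Return the input shape plus the size and activation of the output layer."""
    if task == "multiclass":
        # One unit per class; softmax turns scores into class probabilities.
        output = {"units": num_classes, "activation": "softmax"}
    elif task == "binary":
        # A single sigmoid unit gives the probability of the positive class.
        output = {"units": 1, "activation": "sigmoid"}
    elif task == "regression":
        # A single linear unit predicts an unbounded real value.
        output = {"units": 1, "activation": "linear"}
    else:
        raise ValueError(f"unknown task: {task!r}")
    # The input layer simply mirrors the shape of one data sample.
    return {"input_shape": image_shape, "output": output}


if __name__ == "__main__":
    for task in ("multiclass", "binary", "regression"):
        print(task, io_layer_spec(task))
```

Everything between those two ends (the convolution, pooling, and dense layers) is where the actual design choices are, and that is what the tutorials cover.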