What are the design rules for optimal CNNs, and where can I find them?

Being new to PyTorch, I’ve run a few CNNs on my multichannel time series data, but I’m keen to understand how the design rules for an optimal classification CNN relate to the characteristics of the input signal. After reading books (e.g. by Goodfellow, Brunton, Géron, and Chollet) and appreciating the wide variety of CNNs from the various *Net competitions, I’m amazed there’s so little on the internet and in books about the design rules, e.g. what determines the optimal depth and width of a CNN. Are there any useful texts that could help me in this area? NZ