What are good practices for using GANs with labels?

While building a GAN, I read that conditioning on class labels can be extremely useful.
I know there are many GAN architectures that make use of labels. Is there stronger evidence, empirical or theoretical, for any one of them over the others? And where should I start looking for fair comparisons?

I am aware of architectures such as CGAN, InfoGAN, and ACGAN.
Some of these feed the labels to the discriminator as an input. For low-resolution images like MNIST, it makes sense to concatenate the flattened image with the label vector, but how does this translate to larger images?
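
For reference, here is a minimal sketch of the MNIST-style concatenation I mean, assuming PyTorch. The function name `concat_image_and_label` and the random tensors are just placeholders for illustration, not taken from any particular paper.

```python
import torch
import torch.nn.functional as F

def concat_image_and_label(images, labels, num_classes=10):
    """Flatten each image and append its one-hot label (hypothetical helper)."""
    flat = images.flatten(start_dim=1)                       # (N, C*H*W)
    one_hot = F.one_hot(labels, num_classes).to(flat.dtype)  # (N, num_classes)
    return torch.cat([flat, one_hot], dim=1)                 # (N, C*H*W + num_classes)

# Stand-in for a batch of four MNIST-sized images with integer class labels.
images = torch.rand(4, 1, 28, 28)
labels = torch.tensor([3, 1, 4, 1])

disc_input = concat_image_and_label(images, labels)
print(disc_input.shape)  # torch.Size([4, 794])
```

With 28x28 images this gives 784 pixel values plus 10 label entries per sample, which is manageable. The flattened part grows quickly with resolution, which is why I am unsure this carries over.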

My current thought is to use several 2D convolutional layers whose output is eventually fed into linear layers that also take the label as input (one-hot encoded, one binary input per class). Does this design make sense?
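
To make the idea concrete, here is a rough sketch of the design I have in mind, assuming PyTorch. The class name `LateFusionDiscriminator`, the layer sizes, and the 64x64 RGB input are all arbitrary choices of mine for illustration, not a tested or recommended configuration.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class LateFusionDiscriminator(nn.Module):
    """Conv features first, one-hot label joined at the linear layers (sketch)."""

    def __init__(self, num_classes=10, in_channels=3):
        super().__init__()
        self.num_classes = num_classes
        # Convolutional stack that only sees the image.
        self.features = nn.Sequential(
            nn.Conv2d(in_channels, 32, kernel_size=4, stride=2, padding=1),
            nn.LeakyReLU(0.2),
            nn.Conv2d(32, 64, kernel_size=4, stride=2, padding=1),
            nn.LeakyReLU(0.2),
            nn.AdaptiveAvgPool2d(1),  # (N, 64, 1, 1) regardless of image size
            nn.Flatten(),             # (N, 64)
        )
        # Linear layers that see the image features plus the one-hot label.
        self.classifier = nn.Sequential(
            nn.Linear(64 + num_classes, 64),
            nn.LeakyReLU(0.2),
            nn.Linear(64, 1),  # single real/fake logit
        )

    def forward(self, images, labels):
        feats = self.features(images)
        one_hot = F.one_hot(labels, self.num_classes).to(feats.dtype)
        return self.classifier(torch.cat([feats, one_hot], dim=1))

disc = LateFusionDiscriminator()
logits = disc(torch.rand(2, 3, 64, 64), torch.tensor([0, 7]))
print(logits.shape)  # torch.Size([2, 1])
```

Here the label never reaches the convolutional layers; it only enters at the first linear layer, which is the part of the design I am least sure about.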