How would one create a Conditional Variational Autoencoder with Convolutional layers?

I built a CVAE on the MNIST dataset that uses only fully connected layers, and I would now like to work with larger images, where I expect convolutional layers to be necessary. I have been looking for an example of a Conditional Variational Autoencoder that applies convolutional layers first and fully connected layers afterwards, but have not found one. Does anyone know of a CVAE built this way that works on larger images, and/or have any tips on how I should adapt my code? Thanks.
Link to original code: CVAE-using-MNIST-dataset/ at main · StructuredProgramming/CVAE-using-MNIST-dataset · GitHub
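
For reference, here is a minimal sketch of the kind of architecture I mean: convolutional layers first, then fully connected layers for the latent mean and log-variance, with the label fed to both the encoder and the decoder. It assumes PyTorch, 64x64 RGB inputs scaled to [0, 1], and 10 integer class labels. The names `ConvCVAE` and `cvae_loss` and all layer sizes are hypothetical choices for illustration and are not taken from the linked code.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class ConvCVAE(nn.Module):
    """Hypothetical conditional VAE: conv layers, then fully connected layers."""

    def __init__(self, in_channels=3, image_size=64, num_classes=10, latent_dim=32):
        super().__init__()
        # Assumption: image_size is divisible by 8 (three stride-2 convolutions).
        self.num_classes = num_classes
        self.feat_size = image_size // 8
        flat_dim = 128 * self.feat_size ** 2

        # Encoder: the one-hot label is appended as num_classes constant channels.
        self.enc_conv = nn.Sequential(
            nn.Conv2d(in_channels + num_classes, 32, kernel_size=4, stride=2, padding=1),
            nn.ReLU(),
            nn.Conv2d(32, 64, kernel_size=4, stride=2, padding=1),
            nn.ReLU(),
            nn.Conv2d(64, 128, kernel_size=4, stride=2, padding=1),
            nn.ReLU(),
        )
        self.fc_mu = nn.Linear(flat_dim, latent_dim)
        self.fc_logvar = nn.Linear(flat_dim, latent_dim)

        # Decoder: the label is concatenated with z before the first linear layer.
        self.dec_fc = nn.Linear(latent_dim + num_classes, flat_dim)
        self.dec_conv = nn.Sequential(
            nn.ConvTranspose2d(128, 64, kernel_size=4, stride=2, padding=1),
            nn.ReLU(),
            nn.ConvTranspose2d(64, 32, kernel_size=4, stride=2, padding=1),
            nn.ReLU(),
            nn.ConvTranspose2d(32, in_channels, kernel_size=4, stride=2, padding=1),
            nn.Sigmoid(),  # outputs in [0, 1] to match the binary cross-entropy loss
        )

    def encode(self, x, y_onehot):
        # Broadcast the label from (B, C) to (B, C, H, W) so it can join the image.
        y_map = y_onehot[:, :, None, None].expand(-1, -1, x.size(2), x.size(3))
        h = self.enc_conv(torch.cat([x, y_map], dim=1)).flatten(start_dim=1)
        return self.fc_mu(h), self.fc_logvar(h)

    def reparameterize(self, mu, logvar):
        std = torch.exp(0.5 * logvar)
        return mu + std * torch.randn_like(std)

    def decode(self, z, y_onehot):
        h = F.relu(self.dec_fc(torch.cat([z, y_onehot], dim=1)))
        h = h.view(-1, 128, self.feat_size, self.feat_size)
        return self.dec_conv(h)

    def forward(self, x, y):
        y_onehot = F.one_hot(y, self.num_classes).float()
        mu, logvar = self.encode(x, y_onehot)
        z = self.reparameterize(mu, logvar)
        return self.decode(z, y_onehot), mu, logvar


def cvae_loss(recon, x, mu, logvar):
    # Reconstruction term plus KL divergence to a standard normal prior.
    bce = F.binary_cross_entropy(recon, x, reduction="sum")
    kld = -0.5 * torch.sum(1 + logvar - mu.pow(2) - logvar.exp())
    return bce + kld


if __name__ == "__main__":
    model = ConvCVAE()
    x = torch.rand(4, 3, 64, 64)        # dummy batch of images in [0, 1]
    y = torch.randint(0, 10, (4,))      # dummy integer class labels
    recon, mu, logvar = model(x, y)
    print(recon.shape, cvae_loss(recon, x, mu, logvar).item())
```

The only parts that differ from a fully connected CVAE are how the label is injected (as extra input channels in the encoder, as extra entries on the latent vector in the decoder) and the flatten/reshape step between the convolutional and linear layers. Other conditioning schemes, such as concatenating the label only after the flatten step, would also be reasonable.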