Idea for a CV project: is it realistic, and what is the best approach?

Hi there!

I love building models from scratch, and really just building stuff in general. I recently built a conditional GAN in which the trained generator takes a label and creates a fake image that should resemble the class associated with that label.
Since I want to turn this little project into something bigger and visually pleasing, I thought about training the model on animals, so that the generator can create a lion, a dog, or whatever else it was trained on when given that class as input. This is something I should be able to do with some time and effort.
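
To make the "class as input" part concrete, here is a minimal sketch of one common way to condition a generator: concatenating a noise vector with a one-hot label. This is an illustration only, not my actual model; the class list, `NOISE_DIM`, and the helper names `one_hot` and `generator_input` are all made up for this example.

```python
from __future__ import annotations

import random

# Hypothetical class list; the real label set depends on the training data.
CLASSES = ["lion", "dog", "monkey"]
NOISE_DIM = 8  # assumed latent size, chosen only for illustration


def one_hot(label: str) -> list[float]:
    """Encode a class name as a one-hot vector over CLASSES."""
    vec = [0.0] * len(CLASSES)
    vec[CLASSES.index(label)] = 1.0
    return vec


def generator_input(label: str, rng: random.Random) -> list[float]:
    """Build a conditioned input: Gaussian noise followed by the one-hot label."""
    noise = [rng.gauss(0.0, 1.0) for _ in range(NOISE_DIM)]
    return noise + one_hot(label)


z = generator_input("lion", random.Random(0))
print(len(z), z[NOISE_DIM:])  # prints: 11 [1.0, 0.0, 0.0]
```

In a real network this vector would be fed to the generator's first layer. A learned label embedding is another option I could imagine instead of the one-hot encoding.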
However, my main goal is to let the user create a zoo: the user names some animals, such as a monkey and a lion, and the final picture should show a monkey and a lion in a zoo, in realistic positions. For example, if there is a tree, the monkey should sit on it in a realistic pose, and the lion should lie on the grass.
Those are just some stupid ideas, but I wonder whether this is achievable at all. If so, where should I start?
Any help is greatly appreciated!