How to preserve GT image texture when upsampling with transposeConv2d?

Hi all,

My input is a concatenated embedding vector.
I apply a series of five (transposeConv2d -> BN -> ReLU) blocks to get an image-sized matrix with one channel.
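
For reference, here is a rough sketch of the decoder, assuming PyTorch (`nn.ConvTranspose2d`). The embedding size, channel widths, the linear projection to a 4x4 feature map, and the kernel/stride/padding settings are placeholder assumptions for illustration, not necessarily the exact configuration:

```python
import torch
import torch.nn as nn


def up_block(in_ch, out_ch):
    # One transposeConv2d -> BN -> ReLU stage.
    # kernel_size=4, stride=2, padding=1 doubles the height and width.
    return nn.Sequential(
        nn.ConvTranspose2d(in_ch, out_ch, kernel_size=4, stride=2, padding=1, bias=False),
        nn.BatchNorm2d(out_ch),
        nn.ReLU(inplace=True),
    )


class VanillaDecoder(nn.Module):
    # embed_dim, base_ch and start_size are hypothetical values.
    def __init__(self, embed_dim=128, base_ch=256, start_size=4):
        super().__init__()
        self.base_ch = base_ch
        self.start_size = start_size
        # Assumed step: project the flat embedding to a small spatial map.
        self.fc = nn.Linear(embed_dim, base_ch * start_size * start_size)
        # Five upsampling stages; the last one outputs a single channel.
        chans = [base_ch, 128, 64, 32, 16, 1]
        self.blocks = nn.Sequential(
            *[up_block(chans[i], chans[i + 1]) for i in range(5)]
        )

    def forward(self, z):
        x = self.fc(z).view(-1, self.base_ch, self.start_size, self.start_size)
        return self.blocks(x)


z = torch.randn(2, 128)      # batch of 2 concatenated embeddings
out = VanillaDecoder()(z)
print(out.shape)             # 4 * 2**5 = 128, so torch.Size([2, 1, 128, 128])
```

Note that in this sketch the last block also ends with BN -> ReLU, so the output is non-negative.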

After “supervised” training I get an image that closely matches the overall distribution of the GT image, although it is slightly smooth and the texture isn’t preserved locally.
I tried training with a variety of hyper-parameters and numbers of samples; all of them give the same
poor visual quality.

Should I use skip connections to solve the texture issue? Any other relevant ideas?
I’d be happy to hear any suggestions to improve this vanilla Decoder architecture.

Thanks,
Or