Why are semantic segmentation results blurry/fuzzy? Can semantic segmentation be used to predict the 3 RGB channels of an image?

Hi all,
I am working on semantic segmentation using the U-Net architecture. I started by trying to predict the 3 RGB channels of the target image directly. I know that semantic segmentation normally expects a binary mask per class, but I gave it a shot anyway. The results are OK, as shown below, but also blurry/fuzzy. I would like to find the source of this blur and reduce it. Is there a way to improve the RGB prediction without having to split the label image into one mask per color?

[Image: pytorch_forum_Segmentation]
Note that the top picture shows the target (the label we want to predict), while the bottom picture shows the prediction from the U-Net.
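
For context, this is roughly what I mean by splitting the label image into colors. It is only a minimal sketch: the palette below is hypothetical (my real label colors are not listed here), and the helper names `rgb_to_class_index` and `class_index_to_rgb` are made up for illustration. It assumes every label pixel is exactly one of the palette colors.

```python
import numpy as np

# Hypothetical palette: one RGB color per class. The real label colors
# are not given in this post, so these values are placeholders.
PALETTE = np.array([
    [0, 0, 0],      # class 0
    [255, 0, 0],    # class 1
    [0, 255, 0],    # class 2
    [0, 0, 255],    # class 3
], dtype=np.uint8)


def rgb_to_class_index(label_rgb, palette=PALETTE):
    """Map an (H, W, 3) uint8 label image to an (H, W) int64 class-index map."""
    # Compare every pixel with every palette color:
    # (H, W, 1, 3) == (1, 1, K, 3) -> (H, W, K, 3), reduced to (H, W, K).
    matches = (label_rgb[:, :, None, :] == palette[None, None, :, :]).all(axis=-1)
    # Fail loudly if some pixel matches no palette color (e.g. anti-aliased edges).
    if not matches.any(axis=-1).all():
        raise ValueError("label contains a color that is not in the palette")
    return matches.argmax(axis=-1).astype(np.int64)


def class_index_to_rgb(class_map, palette=PALETTE):
    """Map an (H, W) class-index map back to an (H, W, 3) color image."""
    return palette[class_map]
```

As far as I understand, an index map like this is the kind of target that a classification loss such as `torch.nn.CrossEntropyLoss` expects (one integer class per pixel), with the network producing one output channel per class instead of 3 RGB channels.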