I am building a CycleGAN with a U-Net structure to improve the image quality of Cone Beam Computed Tomography (CBCT), with Fan Beam Computed Tomography (FBCT) as the target images. Because I mainly want to enhance the quality in the lung region, I crop the lung volumes out of the original images and assign the value 0 to all regions outside the lung (the lung pixel values range from -1000 to 0). Scaling and normalization are then applied to the array over the range -1000 to 0. Please see the following images as an example of my dataset:
In the example above, the leftmost image is the CBCT and the rightmost image is the FBCT. The middle one is the image generated from the CBCT by the CycleGAN model. This example is from a very early epoch of training.
But as training continues, the model gradually loses its ability to capture the anatomy in the images and eventually generates a blank image with all values equal to 0 (see image below).
The training loss climbs back up after several epochs, and the model starts to lose the anatomical information.
What makes me curious is that this loss of anatomy does not occur when I simply use the whole CBCT and FBCT images as the dataset, without any lung segmentation or value assignment to the regions outside the lung. When given un-segmented images, the model successfully translates the CBCT so that it mimics the FBCT quality. I do the segmentation because I want the model to concentrate only on the lung region, to see if it performs better.
I wonder if this is a consequence of the background having a much higher value than the region of interest (i.e., background value: 0; lung values: -1000 to 0). Is there any published work on CycleGAN training with images that contain a blank background? If so, are there any special measures to take when assigning a value to the background, or when doing normalization and scaling? I can’t really find any so far.
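To make the concern about the background value concrete, the preprocessing described above can be sketched roughly as follows. This is a minimal NumPy illustration rather than the exact pipeline: the function name `mask_and_normalize`, the toy array, and the choice of [-1, 1] as the output range are all assumptions made for the example.

```python
import numpy as np

def mask_and_normalize(img, lung_mask, lo=-1000.0, hi=0.0):
    """Assign `hi` (0 here) outside the lung, then scale [lo, hi] to [-1, 1].

    The [-1, 1] target range is an assumption for this sketch.
    """
    masked = np.where(lung_mask, img, hi)   # background gets the value 0
    clipped = np.clip(masked, lo, hi)       # keep values inside [-1000, 0]
    return 2.0 * (clipped - lo) / (hi - lo) - 1.0

# Toy 2x3 "slice"; the mask marks which pixels belong to the lung.
img = np.array([[-950.0, -700.0, 40.0],
                [-500.0, 300.0, -1000.0]])
lung_mask = np.array([[True, True, False],
                      [True, False, True]])

out = mask_and_normalize(img, lung_mask)
print(out)
print("background values:", out[~lung_mask])
```

With this kind of scaling, every background pixel lands at the top of the normalized range, while the lung values are spread over the rest of it, which is the situation I am asking about.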
Any insight is appreciated. Thank you.