Regression with semi-supervised VAE

I’m currently working on a VAE that does regression with semi-supervised learning: there is labeled and unlabeled data, and the goal is to produce acceptable label predictions for the unlabeled data.

The data (images) is normalized to the [0,1] range and Adam is used as the optimizer.
The architecture is (roughly sketched in code after the list):

  1. Encoder with convolutions and ReLUs
  2. Three fully connected layers with the bottleneck in the middle
  3a. Decoder with ConvTranspose, BatchNorms, ReLUs and Sigmoid
  3b. Regressor with one linear layer, dropout and Softplus
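
A minimal PyTorch sketch of this layout (all layer sizes and the 28x28 single-channel input are placeholders, not my exact configuration):

```python
import torch
import torch.nn as nn

class SemiSupervisedVAE(nn.Module):
    def __init__(self, latent_dim=32, hidden_dim=256):
        super().__init__()
        # 1. Convolutional encoder (assumes 1x28x28 inputs)
        self.encoder = nn.Sequential(
            nn.Conv2d(1, 32, 4, stride=2, padding=1), nn.ReLU(),   # -> 32x14x14
            nn.Conv2d(32, 64, 4, stride=2, padding=1), nn.ReLU(),  # -> 64x7x7
            nn.Flatten(),
        )
        # 2. Fully connected layers with the bottleneck in the middle
        self.fc_enc = nn.Linear(64 * 7 * 7, hidden_dim)
        self.fc_mu = nn.Linear(hidden_dim, latent_dim)
        self.fc_logvar = nn.Linear(hidden_dim, latent_dim)
        self.fc_dec = nn.Linear(latent_dim, hidden_dim)  # FC layer after the bottleneck
        # 3a. Transposed-convolution decoder
        self.decoder = nn.Sequential(
            nn.Linear(hidden_dim, 64 * 7 * 7),
            nn.Unflatten(1, (64, 7, 7)),
            nn.ConvTranspose2d(64, 32, 4, stride=2, padding=1),
            nn.BatchNorm2d(32), nn.ReLU(),
            nn.ConvTranspose2d(32, 1, 4, stride=2, padding=1),
            nn.Sigmoid(),
        )
        # 3b. Regressor head, fed from the post-bottleneck FC layer
        self.regressor = nn.Sequential(
            nn.Dropout(0.5),
            nn.Linear(hidden_dim, 1),
            nn.Softplus(),
        )

    def reparameterize(self, mu, logvar):
        std = torch.exp(0.5 * logvar)
        return mu + std * torch.randn_like(std)

    def forward(self, x):
        h = torch.relu(self.fc_enc(self.encoder(x)))
        mu, logvar = self.fc_mu(h), self.fc_logvar(h)
        z = self.reparameterize(mu, logvar)
        h_dec = torch.relu(self.fc_dec(z))
        return self.decoder(h_dec), self.regressor(h_dec), mu, logvar
```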

My questions are:

  1. Does it make sense to keep 3b separate from the decoder and to feed the last fully connected layer (i.e. the one after the bottleneck, not the bottleneck itself) into it? I also tried putting the result into an additional image layer of the decoder and taking the average of that layer as the prediction.
  2. Does it make sense to simply add the MSE loss of the regressor to the BCE+KLD loss of the VAE, scaled by some factor (see the sketch after this list), or should MSE in this case also be used for the decoder loss?
  3. Are there any suggestions or tips for this architecture? At the moment it seems odd to me that the MSE loss becomes relatively small compared to the BCE loss, which is not really going down; presumably this is because the data isn’t binary but spread over the whole greyscale range.
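
To make question 2 concrete, here is a rough sketch of the combination I mean (the scaling factor `alpha` and the labeled-sample masking are just placeholders):

```python
import torch
import torch.nn.functional as F

def vae_regression_loss(recon_x, x, y_pred, y, mu, logvar,
                        alpha=10.0, labeled_mask=None):
    # Reconstruction term of the VAE (inputs are normalized to [0,1]);
    # the alternative from question 2 would be
    # F.mse_loss(recon_x, x, reduction='sum') here instead.
    recon = F.binary_cross_entropy(recon_x, x, reduction='sum')
    # KL divergence between q(z|x) and the standard normal prior
    kld = -0.5 * torch.sum(1 + logvar - mu.pow(2) - logvar.exp())
    loss = recon + kld
    # Regressor MSE, computed only on the labeled part of the batch
    # and scaled by alpha
    if labeled_mask is not None and labeled_mask.any():
        loss = loss + alpha * F.mse_loss(
            y_pred[labeled_mask], y[labeled_mask], reduction='sum')
    return loss
```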

Thanks in advance. :slight_smile:

@Tavados Have you had any success with that, my friend? I am currently working on a similar project and wondered about your second question as well.

Hi, has anybody solved these problems? I am now working on a regression problem with discrete data and a VAE, and I also tried using MSE loss instead of BCE. The loss value is really low, but the outputs do not look good, and I wonder what the reason is.
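
One thing I am double-checking on my side, in case it explains the low value: with `reduction='mean'` the MSE is averaged over every pixel, so it can look tiny next to a summed KLD term even when the reconstructions are bad. A toy comparison:

```python
import torch
import torch.nn.functional as F

x = torch.rand(128, 1, 28, 28)      # dummy batch in [0,1]
recon = torch.rand(128, 1, 28, 28)  # dummy reconstructions

# Per-pixel average: ~0.17 here, which looks "really low"...
print(F.mse_loss(recon, x, reduction='mean').item())
# ...while the summed version is ~1.7e4, on the scale of a summed KLD.
print(F.mse_loss(recon, x, reduction='sum').item())
```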