Hello, I have a set of ~1300 binary features that I'm trying to embed with a fully-connected VAE. I've noticed that, regardless of normalization, dropout, activation functions, etc., the model stops learning once it hits a reconstruction accuracy of 76%. I checked, and that value corresponds exactly to predicting each feature with its mode (its per-feature majority value).
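
For reference, this is roughly the check I ran for that baseline: the accuracy you get by always predicting each feature's majority value. The random matrix below is just a stand-in for my data:

```
import numpy as np

# X: binary matrix of shape (n_samples, n_features); random stand-in here.
# On the real data this baseline comes out to ~76%.
X = np.random.randint(0, 2, size=(10_000, 1282))

modes = (X.mean(axis=0) >= 0.5).astype(X.dtype)  # per-feature majority value
baseline_acc = (X == modes).mean()               # fraction of entries matched
print(f"mode baseline accuracy: {baseline_acc:.3f}")
```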

I've even expanded the width of the model (and the size of the latent space) to rule out insufficient capacity; I would expect a model of this size to fit the data essentially perfectly (see the sanity check sketched after the printout below).

```
FcVae(
  (encode_list): ModuleList(
    (0-2): 3 x FCBlock(
      (fc_block): Sequential(
        (0): Linear(in_features=1282, out_features=1282, bias=True)
        (1): Identity()
        (2): LeakyReLU(negative_slope=0.02, inplace=True)
      )
    )
  )
  (encode_fc_mean): FCBlock(
    (fc_block): Sequential(
      (0): Linear(in_features=1282, out_features=1282, bias=True)
    )
  )
  (encode_fc_log_var): FCBlock(
    (fc_block): Sequential(
      (0): Linear(in_features=1282, out_features=1282, bias=True)
    )
  )
  (decode_list): ModuleList(
    (0-2): 3 x FCBlock(
      (fc_block): Sequential(
        (0): Linear(in_features=1282, out_features=1282, bias=True)
        (1): Identity()
        (2): LeakyReLU(negative_slope=0.02, inplace=True)
      )
    )
    (3): FCBlock(
      (fc_block): Sequential(
        (0): Linear(in_features=1282, out_features=1282, bias=True)
      )
    )
  )
)
[Network Embed] Total number of parameters : 14.803 M
```
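
The sanity check I mean, as a sketch: with the KL term dropped, a plain autoencoder of this size should memorize a single batch and push BCE (and accuracy) well past the mode baseline. `FcVae`'s constructor arguments and its forward returning `(logits, mu, log_var)` are assumptions here:

```
import torch
import torch.nn.functional as F

model = FcVae()                                   # hypothetical ctor; the model printed above
batch = torch.randint(0, 2, (256, 1282)).float()  # stand-in binary batch

optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
for step in range(2000):
    logits, mu, log_var = model(batch)            # assumed forward signature
    # KL weight set to zero: this is the pure-capacity test.
    loss = F.binary_cross_entropy_with_logits(logits, batch)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

acc = ((logits > 0).float() == batch).float().mean().item()
print(f"single-batch accuracy after overfitting: {acc:.3f}")
```

(If the forward samples z via reparameterization, the sampling noise caps this a little, but accuracy should still end up far above 76%.)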

I should mention that the loss I'm using is BCE (as the reconstruction term) with the Adam optimizer.
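
Concretely, assuming the standard VAE objective (BCE reconstruction plus KL), a minimal sketch of what I mean; the `beta` weight and the sum-then-average reduction are illustrative:

```
import torch
import torch.nn.functional as F

def vae_loss(x, logits, mu, log_var, beta=1.0):
    # Reconstruction term: BCE over the binary features, computed from raw
    # logits for numerical stability, averaged over the batch.
    recon = F.binary_cross_entropy_with_logits(logits, x, reduction="sum") / x.size(0)
    # KL( q(z|x) || N(0, I) ), also averaged over the batch.
    kl = -0.5 * torch.sum(1 + log_var - mu.pow(2) - log_var.exp()) / x.size(0)
    return recon + beta * kl
```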