I am fitting a Bayesian CNN with several layers (more than 3) using Stochastic Variational Inference (SVI) in the Pyro package.
However, after defining the NN, model, and guide functions and running the training loop, I found that the loss stops decreasing at around 8000 (which is extremely high). I tried different learning rates and different optimizers, but none of them reaches a loss lower than 8000.
I then changed the autoguide (I tried AutoNormal, AutoGaussian, and AutoBeta), but with each of them the loss stopped decreasing at the same point.
The last thing I tried was AutoMultivariateNormal. With it the loss reached negative values (it went down to -1), but when I looked at the weight matrices I found that they had all turned into scalars!
The following graph shows the loss curve for AutoMultivariateNormal: