Using a layer parameter in the loss function directly (this is not regularization)

Hi,

My loss function requires me to use the weights of the last layer of the model directly. Here is my use case:

  1. h = output from an intermediate layer
  2. S = weights of the last layer of the model
  3. ypred = output of the model, which is discarded during training and used only for inference
  4. A multivariate normal distribution is constructed by computing mean and covariance for the batch from h and S
  5. The loss is then the negative log likelihood of the actual target computed from the distribution obtained in step 4
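
The steps above can be sketched roughly as follows. This is only a minimal sketch under several assumptions: it uses PyTorch (no framework is named here), the `Net` class, layer sizes, and random data are hypothetical, and the way the mean and covariance are built from h and S is a placeholder, since the actual construction is not specified. It runs one training step and checks whether the last layer's weight receives a gradient when the parameter tensor itself is passed to the loss.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

class Net(nn.Module):
    # Hypothetical two-layer model; sizes are arbitrary.
    def __init__(self, d_in=8, d_hidden=4, d_out=3):
        super().__init__()
        self.body = nn.Linear(d_in, d_hidden)
        self.last = nn.Linear(d_hidden, d_out, bias=False)

    def forward(self, x):
        h = torch.tanh(self.body(x))   # step 1: intermediate output
        ypred = self.last(h)           # step 3: only needed for inference
        return h, ypred

def nll_loss(h, S, y):
    # Placeholder construction (assumption, not the real formula):
    # mean = h S^T, covariance = S S^T + eps * I.
    mean = h @ S.T
    cov = S @ S.T + 1e-3 * torch.eye(S.shape[0])
    dist = torch.distributions.MultivariateNormal(mean, covariance_matrix=cov)
    return -dist.log_prob(y).mean()    # step 5: negative log likelihood

model = Net()
opt = torch.optim.SGD(model.parameters(), lr=0.1)
x = torch.randn(16, 8)
y = torch.randn(16, 3)

before = model.last.weight.detach().clone()

h, _ = model(x)                        # ypred is discarded during training
# Step 2: pass the parameter itself (no .detach(), no copy),
# so that it stays part of the autograd graph.
loss = nll_loss(h, model.last.weight, y)

opt.zero_grad()
loss.backward()
opt.step()

grad_norm = model.last.weight.grad.norm().item()
weights_changed = not torch.equal(before, model.last.weight.detach())
print("last-layer grad norm:", grad_norm)
print("last-layer weights changed:", weights_changed)
```

In this sketch the important detail is that `model.last.weight` is used as-is inside the loss. Detaching it or converting it to a NumPy array first would cut it out of the graph, and the last layer would then get no gradient from this loss.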

If I train the model this way, will it still learn? In particular, will the weights of the last layer be updated, even though the layer's output is not used in the loss?