I have a neural network model called **S()** and a dataset **X^{5000 \times 18}**, where 5000 is the number of samples and 18 is their dimensionality.

At the first (linear) layer of **S()**, I would like to learn `X = X * A * theta`

such that **A^{18 \times 18}**

With **A** randomly initialized and **theta** the parameters of the model

And `A=B * B^{T}`

with **B^{18 \times 60}** randomly initialized.

I would like to learn A, which is the same for all samples (i.e., A is a parameter of the model).

Is there any trick to do that?
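
For concreteness, here is a minimal sketch of the setup described above, assuming PyTorch (the question does not name a framework). The idea it illustrates is to register **B** as the trainable parameter and rebuild `A = B * B^{T}` on every forward pass, so A is shared by all samples and updated through B by backpropagation. The class name `SharedALayer`, the output width `out_dim=32`, the scaling of the random initialization, and the random stand-in data are all assumptions made for illustration.

```python
import torch
import torch.nn as nn


class SharedALayer(nn.Module):
    """First layer computing X @ A @ theta with A = B @ B.T (hypothetical name)."""

    def __init__(self, in_dim=18, inner_dim=60, out_dim=32):
        super().__init__()
        # B is the trainable factor; A is never stored, only derived from B.
        # Dividing by sqrt(inner_dim) is an assumed scaling, not from the question.
        self.B = nn.Parameter(torch.randn(in_dim, inner_dim) / inner_dim ** 0.5)
        # theta plays the role of the linear layer's weights (out_dim is assumed).
        self.theta = nn.Linear(in_dim, out_dim, bias=False)

    @property
    def A(self):
        # Rebuilt from the current B on each access, so A is the same
        # 18 x 18 matrix for every sample in the batch.
        return self.B @ self.B.T

    def forward(self, x):
        # x: (batch, in_dim) -> (batch, out_dim)
        return self.theta(x @ self.A)


torch.manual_seed(0)
X = torch.randn(5000, 18)      # random stand-in for the real dataset
layer = SharedALayer()
out = layer(X)
out.sum().backward()           # dummy loss, only to check that gradients reach B
```

Because `B` is an `nn.Parameter`, it shows up in `layer.parameters()` and any optimizer built from those parameters will update it together with `theta`. The rest of **S()** would be stacked after this layer.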

