Formulation for a custom regularizer to minimize the amount of space taken by weights

Hello all.

Below is a screenshot of a custom regularizer I have implemented.

The goal of this regularizer is to minimize the total amount of space taken by the weights, rather than to affect the value of any one weight. However, by the nature of a regularizer, I need a formulation that is weight dependent, since otherwise backprop won't take it into account.

Essentially, if there are too many weights, it should push all weights towards zero equally, and if there are too few weights, it should push the weight values up.

My current formulation works but is not weight dependent. I have a function that pushes towards the intended points given the number of bytes used and the number of bytes available, but, as said, it's weight independent.

I appreciate any input.

PS: While I use numpy operations here, I am aware I need to use torch operations on tensors so as to remain in the computation graph. The problem is that resources_used remains a scalar, and the computations remain weight independent.
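To illustrate, my current formulation is roughly like the sketch below (names such as `capacity_penalty` and `bytes_available` are placeholders for illustration, not my exact code):

```python
import torch

def capacity_penalty(model, bytes_available):
    # numel() and element_size() return plain Python ints, so the
    # total below is a constant scalar with no autograd history.
    bytes_used = sum(p.numel() * p.element_size() for p in model.parameters())
    # Positive when over budget, negative when under budget, but the
    # result does not depend on any weight value, so its gradient is 0.
    return torch.tensor(bytes_used / bytes_available - 1.0)
```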

Hi,

The penalty that you compute does not depend on the values of the parameters, right? So it is expected that no gradients w.r.t. these values can be computed.

Exactly, so I’m looking for a formulation that does the same but is weight dependent.

Oh, sorry, I missed one paragraph in your question :confused:

Well, anything related to the number of entries won't work, as it won't be differentiable.
You can't just normalize each value to 1 to count them, because that would give you a gradient of 0.
I guess you could try to center the parameters and use the standard deviation? You would measure how much each tensor varies from its mean. In theory, you could then represent a tensor as a mean value plus per-entry offsets. So the closer the entries are to the mean, the better?
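As a rough sketch of that idea (`spread_penalty` is just a placeholder name):

```python
import torch

def spread_penalty(model):
    # Penalize how much each parameter tensor varies around its mean.
    # Driving entries towards a shared mean means the tensor could, in
    # principle, be stored as one mean plus small per-entry offsets.
    penalty = torch.zeros(())
    for p in model.parameters():
        penalty = penalty + (p - p.mean()).pow(2).mean()
    return penalty
```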

Perhaps an L1 penalty (see here) proportional to what you compute would work?
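For example, something like this rough sketch, where the byte ratio you already compute just scales a standard L1 term (names are placeholders):

```python
import torch

def scaled_l1_penalty(model, bytes_available):
    # Weight-independent scale: > 0 when over budget (shrinks weights
    # towards zero), < 0 when under budget (rewards larger magnitudes).
    bytes_used = sum(p.numel() * p.element_size() for p in model.parameters())
    scale = bytes_used / bytes_available - 1.0
    # The L1 term depends on the weights, so gradients flow through it.
    l1 = sum(p.abs().sum() for p in model.parameters())
    return scale * l1
```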

Thank you for both answers; they have given me some ideas to think about.