Constraints on convolution layer weights

I would like to enforce the weights to sum to 1. Dividing by the sum of the matrix is not enough, since each value can be greater than 1.
For example, the following weight matrix is possible:
kernel = torch.FloatTensor([[0,-1,0],[-1,5,-1],[0,-1,0]])

Any idea how can I enforce it?
Thanks a lot!

Use softmax :slight_smile:

>>> import math
>>> z = [1.0, 2.0, 3.0, 4.0, 1.0, 2.0, 3.0]
>>> z_exp = [math.exp(i) for i in z]
>>> print([round(i, 2) for i in z_exp])
[2.72, 7.39, 20.09, 54.6, 2.72, 7.39, 20.09]
>>> sum_z_exp = sum(z_exp)
>>> print(round(sum_z_exp, 2))
114.98
>>> softmax = [round(i / sum_z_exp, 3) for i in z_exp]
>>> print(softmax)
[0.024, 0.064, 0.175, 0.475, 0.024, 0.064, 0.175]

Code credits: Wikipedia
PS: Use torch softmax, it’s already implemented
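For example, a quick sketch of applying torch.softmax to the kernel from your first post (the flatten-and-reshape step and the names here are just illustrative):

>>> import torch
>>> kernel = torch.FloatTensor([[0, -1, 0], [-1, 5, -1], [0, -1, 0]])
>>> # softmax over a flat view of the kernel, then reshape back to 3x3
>>> kernel_softmax = torch.softmax(kernel.view(-1), dim=0).view(kernel.shape)
>>> print(kernel_softmax.sum())   # approximately 1.0, and every entry is positive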

Thanks!
But I would also like values greater than one. When using softmax, it will not be possible to get the kernel in the example, right?

Yep. However, you are ill-posing this problem because there are multiple solutions. You can achieve the same effect by simply removing the exponential:

>>> z = [1.0, -2.0, -3.0, 4.0, 1.0, 2.0, 3.0]
>>> z_exp = [float(i) for i in z]
>>> print([round(i, 2) for i in z_exp])
[1.0, -2.0, -3.0, 4.0, 1.0, 2.0, 3.0]
>>> sum_z_exp = sum(z_exp)
>>> print(round(sum_z_exp, 2))
6.0
>>> softmax = [round(i / sum_z_exp, 3) for i in z_exp]
>>> print(softmax)
[0.167, -0.333, -0.5, 0.667, 0.167, 0.333, 0.5]
>>> sum(softmax)
1.001

You should also take the absolute value of the sum, so the sign is not inverted when the sum is negative. The normalization is not defined when the sum is 0, since you would be dividing by zero.
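In torch, that linear normalization could look roughly like this (just a sketch; the small eps is my own addition to sidestep the sum-equal-to-zero case):

>>> import torch
>>> kernel = torch.FloatTensor([[0, -1, 0], [-1, 5, -1], [0, -1, 0]])
>>> s = kernel.sum()                         # 1.0 for this particular kernel
>>> # divide by |sum| (+ eps): the signs are kept and we never divide by zero
>>> kernel_norm = kernel / (s.abs() + 1e-8)
>>> print(kernel_norm.sum())                 # ~1.0 here; it would be ~-1.0 if s were negative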

Thanks a lot for the response.
Can you elaborate on what you mean by:
“You should also take the absolute value of the sum, so the sign is not inverted when the sum is negative. The normalization is not defined when the sum is 0, since you would be dividing by zero.”
In addition, it is an ill-posed problem anyway, so why do you say that allowing negative values makes it ill-posed?
Thanks!

I mean that in the last code I wrote, I simply sum all the values. If that sum were negative, the output values would have their signs inverted with respect to the originals.

>>> z = [-1.0, -1.0, -3.0, -4.0, 1.0, 2.0, 3.0]
>>> z_exp = [float(i) for i in z]
>>> print([round(i, 2) for i in z_exp])
[-1.0, -1.0, -3.0, -4.0, 1.0, 2.0, 3.0]
>>> sum_z_exp = abs(sum(z_exp))   # absolute value of the sum
>>> print(round(sum_z_exp, 2))
3.0
>>> softmax = [round(i / sum_z_exp, 3) for i in z_exp]
>>> print(softmax)
[-0.333, -0.333, -1.0, -1.333, 0.333, 0.667, 1.0]

sum(softmax) = -1 and the original signs are preserved.

However, if you don’t take the absolute value:

>>> z = [-1.0, -1.0, -3.0, -4.0, 1.0, 2.0, 3.0]
>>> z_exp = [float(i) for i in z]
>>> print([round(i, 2) for i in z_exp])
[-1.0, -1.0, -3.0, -4.0, 1.0, 2.0, 3.0]
>>> sum_z_exp = sum(z_exp)        # plain sum, no absolute value
>>> print(round(sum_z_exp, 2))
-3.0
>>> softmax = [round(i / sum_z_exp, 3) for i in z_exp]
>>> print(softmax)
[0.333, 0.333, 1.0, 1.333, -0.333, -0.667, -1.0]

They sum to 1, but the signs are inverted.

Besides, since you divide by the sum, the values blow up as the sum approaches 0; a kernel whose weights sum to exactly 0 cannot be rescaled to sum to 1 at all.
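A tiny illustration of that corner case:

>>> z = [1.0, -1.0]                  # weights that sum to 0
>>> [i / sum(z) for i in z]          # ZeroDivisionError: no scale factor can make these sum to 1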

Lastly, it’s an ill-posed problem because, since you put no other constraints on the result, you can satisfy it trivially, for example by setting every value to zero except one, which you set to 1.
Another example: you don’t like softmax because it does not preserve the signs, yet its output weights do sum to 1. There are lots of ways of making them sum to 1, but you didn’t say which properties you want this transformation to have.
In the plain-sum version above, the weights stay proportional to the original values. Here is the softmax version:

>>> import math
>>> z = [-1.0, -2.0, -3.0, 4.0, 1.0, 2.0, 3.0]
>>> z_exp = [math.exp(i) for i in z]
>>> print([round(i, 2) for i in z_exp])
[0.37, 0.14, 0.05, 54.6, 2.72, 7.39, 20.09]
>>> sum_z_exp = sum(z_exp)
>>> print(round(sum_z_exp, 2))
85.34
>>> softmax = [round(i / sum_z_exp, 3) for i in z_exp]
>>> print(softmax)
[0.004, 0.002, 0.001, 0.64, 0.032, 0.087, 0.235]

As you can see, it penalizes negative values.
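If you want the network to keep this property during training, one option is to normalize the raw weights inside forward, so the kernel that is actually applied always sums to 1 in magnitude while individual entries can still be negative or larger than 1. A rough sketch (the module name and the eps are my own choices, not a built-in PyTorch layer):

import torch
import torch.nn as nn
import torch.nn.functional as F

class SumOneConv2d(nn.Module):
    """Conv2d whose effective kernels are divided by |sum| before being applied."""
    def __init__(self, in_channels, out_channels, kernel_size, eps=1e-8):
        super().__init__()
        self.conv = nn.Conv2d(in_channels, out_channels, kernel_size, bias=False)
        self.eps = eps

    def forward(self, x):
        w = self.conv.weight
        # per-output-channel sum over (in_channels, kH, kW)
        s = w.sum(dim=(1, 2, 3), keepdim=True)
        # divide by |sum| (+ eps): signs are preserved and each effective kernel
        # sums to roughly +1 or -1, depending on the sign of its raw sum
        w = w / (s.abs() + self.eps)
        return F.conv2d(x, w, stride=self.conv.stride, padding=self.conv.padding)

# usage: the raw parameters stay unconstrained, only the applied kernel is normalized
layer = SumOneConv2d(1, 1, 3)
out = layer(torch.randn(1, 1, 8, 8))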

Thanks!
Anyway, I don’t see a solution to my problem; I’ll never get the kernel in the example I wrote :frowning:

I don’t understand what you mean. You will never get the kernel?

Sorry, and thanks again for the help.
I would like the network to converge to a kernel with a sum of 1; however, each value in the kernel can also be a number that is not smaller than 1. For example, I want it to be able to converge to:
-1, 0, -1
0, 5, 0
-1, 0, -1
The sum is equal to 1.
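As a quick check, this kernel already satisfies the constraint, so the plain-sum normalization discussed above would leave it unchanged:

>>> import torch
>>> kernel = torch.FloatTensor([[-1, 0, -1], [0, 5, 0], [-1, 0, -1]])
>>> print(kernel.sum())                                   # tensor(1.)
>>> print((kernel / kernel.sum().abs()).equal(kernel))    # True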