Define back-prop on a custom layer while enforcing that its weights sum to 1



I would like to define a new layer, placed before a convolutional layer, which is just a weighted linear combination of my inputs with respect to certain scales. What I want is to learn the weights of this linear combination.

My inputs are:

X_{1}, X_{2}, X_{3}, X_{4}, X_{5}

where X_{i} \in \mathbf{R}^{20 \times 20}

The layer that I would like to add is as follows:

X = \sum_{i=1}^{5} W_{i} X_{i}

where X \in \mathbf{R}^{20 \times 20}, and
W \in \mathbf{R}^{5} is the vector of parameters to learn, with 0 <= W_{i} <= 1 and \sum_{i} W_{i} = 1.

X is then fed to the convolutional layer.

My question is as follows:

How can I define such a layer and do back-propagation on it while respecting the constraints 0 <= W_{i} <= 1 and \sum_{i} W_{i} = 1?

Here is what I have tried:

import torch
import torch.nn as nn

class weighted_linear_combination_layer(nn.Module):
    """
    I would like to write a new layer (the first layer in my CNN, before the convolutional layer), which is
    a linear combination of K inputs. For the sake of illustration we set K=5 here.

    X1, X2, X3, X4, X5: each Xi is of dimension 20x20.

    The target: learning a weighted linear combination of the Xi as follows:

    Final_X = W1 X1 + W2 X2 + W3 X3 + W4 X4 + W5 X5  # Final_X is of dimension 20x20

    W1, W2, W3, W4, W5 are the scalar parameters of the layer, learned with back-propagation and
    initialized at random.

    I would like to ensure that Wi >= 0 and sum(W1, W2, W3, W4, W5) = 1.
    """

    def __init__(self):
        super().__init__()
        self.w1 = nn.Parameter(torch.randn(1))
        self.w2 = nn.Parameter(torch.randn(1))
        self.w3 = nn.Parameter(torch.randn(1))
        self.w4 = nn.Parameter(torch.randn(1))
        self.w5 = nn.Parameter(torch.randn(1))

        # alternative: one vector whose initial values lie in [0, 1] and sum to 1 (at initialization)
        self.weights = nn.Parameter(nn.functional.softmax(torch.randn(5), dim=0))

    def forward(self, x):
        # x here is a 5x20x20 tensor
        final_x = self.w1*x[0] + self.w2*x[1] + self.w3*x[2] + self.w4*x[3] + self.w5*x[4]

        return final_x

I would like to keep the five weights W1, W2, W3, W4, W5 in [0, 1], with their
sum equal to 1.

How can I do that?

layer = weighted_linear_combination_layer()
with torch.no_grad():
    for p in layer.parameters():
        p.clamp_(0, 1)

Does this operation do the job?

Thank you

@rasbt @tom @ptrblck any clue?

(Clément Pinard) #2

Instead of learning the explicit weights W1…W5 directly, maybe you can learn unconstrained parameters and take their softmax to get the weights you want.
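
To spell out why the softmax satisfies both constraints: writing a_i for the unconstrained parameters (a name introduced here only for illustration), the weights would be

```latex
W_i = \frac{\exp(a_i)}{\sum_{j=1}^{5} \exp(a_j)}, \qquad 0 < W_i < 1, \qquad \sum_{i=1}^{5} W_i = 1
```

Gradients flow through the softmax to the a_i, so ordinary back-propagation applies and no clamping or projection step should be needed. Note that the weights are strictly positive, so exact values of 0 or 1 are only approached in the limit.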

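def forward(self, x):
    # self.weights holds unconstrained parameters; the softmax maps them into [0, 1] with sum 1
    normalized_weights = nn.functional.softmax(self.weights, dim=0)
    # x is 5x20x20: move the input axis last so linear() contracts over it, giving 20x20
    x = nn.functional.linear(x.permute(1, 2, 0), normalized_weights)
    return x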
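
Put together as a complete module, the idea might look like the sketch below. The class name `WeightedCombination`, the attribute `logits`, the zero initialization (which gives uniform weights at the start) and the squared-mean loss are all assumptions made for illustration, not part of the original question. The weighted sum is written here with broadcasting rather than `linear`.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class WeightedCombination(nn.Module):
    """Convex combination of K inputs (illustrative name, not from the thread)."""

    def __init__(self, num_inputs=5):
        super().__init__()
        # Unconstrained parameters; zeros give uniform weights 1/K at the start.
        self.logits = nn.Parameter(torch.zeros(num_inputs))

    def forward(self, x):
        # x has shape (K, H, W), e.g. (5, 20, 20).
        weights = F.softmax(self.logits, dim=0)  # each in (0, 1), summing to 1
        # Broadcast the weights over H and W, then sum over the K inputs.
        return (weights.view(-1, 1, 1) * x).sum(dim=0)


torch.manual_seed(0)
layer = WeightedCombination(num_inputs=5)
x = torch.randn(5, 20, 20)
optimizer = torch.optim.SGD(layer.parameters(), lr=0.1)

# One gradient step on a placeholder loss, only to show that training
# updates the logits while the effective weights stay on the simplex.
loss = layer(x).pow(2).mean()
optimizer.zero_grad()
loss.backward()
optimizer.step()

weights_after = F.softmax(layer.logits, dim=0).detach()
output = layer(x)
print(weights_after, weights_after.sum(), output.shape)
```

Because the constraint is built into the forward pass, the optimizer can update `logits` freely and the effective weights remain non-negative and sum to 1 after every step. The output of the layer can then be passed to the convolutional layer (after adding batch and channel dimensions as needed).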