# How to correctly impose a weight constraint

I have the following model:

X_i = max(c_1i X_1, …, c_di X_d) for i=1,…,d.

This leads to a d x d matrix C of parameters. For reasons that are tedious to explain, I have a d x d matrix of estimated parameters with

X^est_i = max(c^est_1i X_1, …, c^est_di X_d)

and choose my loss to be

max_{i=1,…,d}(X^est_i / X_i)

Note that all values in C, as well as in X, are non-negative. This problem should be very easy to solve, as it is convex (and reducing a value c_ji never increases the loss).
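To make the loss concrete, here is a minimal sketch in plain Python. It assumes X^est_i is the maximum over j of c^est_ji X_j, which is how the forward pass in the code further down computes it. The function name `max_ratio_loss` and the 2 x 2 numbers are made up for illustration only.

```python
def max_ratio_loss(C_est, samples):
    """Largest ratio X^est_i / X_i over all coordinates i and all samples.

    Assumes X^est_i = max_j C_est[j][i] * X_j (max over the input index j).
    """
    d = len(C_est)
    worst = 0.0
    for x in samples:
        for i in range(d):
            x_est_i = max(C_est[j][i] * x[j] for j in range(d))
            worst = max(worst, x_est_i / x[i])
    return worst


# Hypothetical data: two samples of a 2-dimensional X.
samples = [[1.0, 2.0], [4.0, 1.0]]
C_big = [[1.0, 0.5], [0.25, 1.0]]
C_small = [[1.0, 0.25], [0.25, 1.0]]  # C_big with the entry c_12 reduced

loss_big = max_ratio_loss(C_big, samples)
loss_small = max_ratio_loss(C_small, samples)
print(loss_big, loss_small)  # 2.0 1.0
```

In this toy case, reducing one entry lowers the loss from 2 to 1, which is the monotonicity I am relying on.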

Now I want to require the sum of all entries of C to be equal to (or larger than) a certain threshold; let us call it eps.

What is the best way to do this? I found three solutions, but all of them seem to fail: either the estimator gets stuck or it goes to infinity. The three approaches I thought of:

1. Define only d^2-1 values and assign the last value as torch.abs(eps-sum(C)), so that the sum of all values is always equal to eps.

2. Define d^2 values and rescale them in each iteration by multiplying each weight by eps/sum(C).

3. Add a penalty lambda*torch.abs(eps-sum(C)) to the loss; for lambda large enough, this should force the sum of the values to be eps.
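For concreteness, approaches 2 and 3 boil down to the following arithmetic. This is only a sketch in plain Python on a made-up 2 x 2 matrix. The helper names (`total`, `rescale_to_sum`, `sum_penalty`) are hypothetical and do not appear in my actual code, where the same operations would act on the torch weight tensors.

```python
def total(C):
    """Sum of all entries of a matrix given as a list of rows."""
    return sum(sum(row) for row in C)


def rescale_to_sum(C, eps):
    """Approach 2: multiply every entry by eps / sum(C)."""
    factor = eps / total(C)  # undefined if all entries of C are zero
    return [[factor * c for c in row] for row in C]


def sum_penalty(C, eps, lam):
    """Approach 3: penalty lam * |eps - sum(C)| to be added to the loss."""
    return lam * abs(eps - total(C))


# Hypothetical values, chosen only for illustration.
C = [[1.0, 0.5], [0.0, 0.5]]  # entries sum to 2.0
eps = 4.0

C_scaled = rescale_to_sum(C, eps)
print(C_scaled)                              # [[2.0, 1.0], [0.0, 1.0]]
print(sum_penalty(C, eps, lam=10.0))         # 20.0
print(sum_penalty(C_scaled, eps, lam=10.0))  # 0.0
```

Rescaling keeps non-negative entries non-negative, since the factor eps/sum(C) is positive whenever eps and sum(C) are.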

I tried all of these approaches, but none of them works: the optimizer either gets stuck or diverges to infinity.

Below is my implementation of the first approach:

```python
import copy

import numpy as np
import torch
import torch.nn as nn
import torch.optim as optim


class Network(nn.Module):
    def __init__(self, dim, total):
        super().__init__()
        d = dim
        self.dim = dim
        self.total = total  # required sum of all entries of C (eps in the text)
        # Rows 0..d-2 of C: one weight vector of length d per input coordinate.
        self.linears = nn.ModuleList([nn.Linear(1, d, bias=False) for _ in range(d - 1)])
        # Last row of C without its first entry, which is fixed by the constraint.
        self.final_layer = nn.Linear(1, d - 1, bias=False)

    def forward(self, x):
        d = self.dim
        y = torch.zeros((d, d, x.size(0)))

        for i, l in enumerate(self.linears):
            y[i, :, :] = torch.transpose(l(x[:, i].view(-1, 1)), 0, 1)

        y[d - 1, 1:d, :] = torch.transpose(self.final_layer(x[:, d - 1].view(-1, 1)), 0, 1)

        # Remaining entry C[d-1, 0], derived from the sum constraint.
        reg = self.weight_constraint()
        y[d - 1, 0, :] = torch.abs(reg - self.total) * x[:, d - 1]

        # Max over the input index, then transpose to shape (n, d) to match x.
        return torch.max(y, dim=0).values.transpose(0, 1)

    def weight_constraint(self):
        # Sum of all d^2-1 free weights.
        reg = 0
        for l in self.linears:
            reg += torch.sum(l.weight)
        reg += torch.sum(self.final_layer.weight)
        return reg


def custom_loss(output, target):
    return torch.max(output / target)


np.random.seed(seed=1)
torch.manual_seed(1)

d = 3
n = 100

C = np.array([[1, 0.5, 0.3], [0, 1, 0], [0, 0, 1]])
lambda1 = 3 + 0.5 + 0.3  # sum of all entries of the true C

model = Network(dim=d, total=lambda1)

# Generate data from the true C.
Z = np.random.lognormal(0, 3, size=(n, d))
X = np.zeros((n, d))
for i in range(n):
    for j in range(d):
        X[i, j] = np.max(C[:, j] * Z[i, :])
X_t = torch.tensor(X, dtype=torch.float32)

optimizer = optim.LBFGS(model.parameters(), lr=0.06)

for t in range(100000):

    def closure():
        optimizer.zero_grad()
        x_pred = model(X_t)
        loss = custom_loss(x_pred, X_t)
        loss.backward()

        # Clamp the weights to keep them non-negative.
        with torch.no_grad():
            for layer in model.linears:
                layer.weight.clamp_(min=0)
            model.final_layer.weight.clamp_(min=0)

        print(loss)
        return loss

    optimizer.step(closure)

# Test whether the true C matrix indeed gives a lower loss.
model2 = copy.deepcopy(model)
with torch.no_grad():
    for j, l in enumerate(model2.linears):
        for i in range(d):
            l.weight[i] = C[j, i]
    for i in range(d - 1):
        model2.final_layer.weight[i] = C[d - 1, i + 1]

    x_pred = model2(X_t)
    loss = custom_loss(x_pred, X_t)
print("Loss value for the true C matrix:", loss)
```

You can see that the loss for the true C is just 1, but the minimum the optimizer finds is far away from 1. Let me quickly explain what I am doing.

I define (d-1) layers of size d and one layer of size (d-1), so I have exactly d^2-1 variables.

Then the forward function calculates X^est based on these layers, and weight_constraint() calculates the sum of all values of the layers.
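Put differently, the d^2-1 free values and the one derived entry together form the full matrix C. The sketch below shows that layout in plain Python. The name `assemble_C` and the numbers are hypothetical. The derived entry sits at position (d-1, 0), mirroring `torch.abs(reg - lambda1)` in my forward function.

```python
def assemble_C(free_values, d, eps):
    """Build a d x d matrix (list of rows) from d*d - 1 free values.

    The free values fill the matrix row by row, skipping position (d-1, 0).
    That entry is set to |eps - sum(free_values)|.
    """
    if len(free_values) != d * d - 1:
        raise ValueError("expected d*d - 1 free values")
    derived = abs(eps - sum(free_values))
    flat = list(free_values)
    flat.insert((d - 1) * d, derived)  # row-major index of entry (d-1, 0)
    return [flat[r * d:(r + 1) * d] for r in range(d)]


# Hypothetical 2 x 2 examples with eps = 4.
C_ok = assemble_C([1.0, 0.5, 0.5], d=2, eps=4.0)
C_over = assemble_C([3.0, 1.0, 1.0], d=2, eps=4.0)

print(C_ok, sum(map(sum, C_ok)))      # [[1.0, 0.5], [2.0, 0.5]] 4.0
print(C_over, sum(map(sum, C_over)))  # [[3.0, 1.0], [1.0, 1.0]] 6.0
```

The second example shows a detail worth keeping in mind: with the absolute value, the entries sum to eps only while the free values sum to at most eps. Once they exceed eps, the derived entry grows again and the total ends up above eps.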

The rest should be basic: I generate data and run the PyTorch optimizer. At the end, I test whether I set up the forward function and the network correctly by plugging in the true C values, which indeed give a loss of 1.

Any idea how I can properly set up this weight constraint, or is it impossible?