# Is there a better way to generate monotone positive trainable weights?

Hi, I was trying to generate a group of trainable, monotone, positive tensors as the weights for Conv1d. It works to some extent, but the problem is that the real trainable parameters, “alpha” and “beta”, change only very slightly compared with orthodox convolution weights; they just fluctuate around their initial values. So I suspect the gradients are not very friendly to train with, due to the way I create those weights. Just curious if there is a better way to handle it. Thanks!

``````
import torch
import torch.nn as nn
import torch.nn.functional as F


class Fir1d(nn.Module):
    def __init__(self, k_size, device, init_alpha, init_beta=0):
        super().__init__()
        # every channel uses the same kernel weights
        assert k_size % 2 == 1
        m = (k_size - 1) // 2
        self.lin = torch.linspace(-m, m, k_size, device=device, requires_grad=False)
        self.alpha = nn.Parameter(torch.tensor([init_alpha], device=device, dtype=torch.float32, requires_grad=True))
        self.beta = nn.Parameter(torch.tensor([init_beta], device=device, dtype=torch.float32, requires_grad=True))
        self.w = torch.softmax(self.lin * self.alpha + self.beta, dim=0).unsqueeze(0).unsqueeze(0)  # 1,1,k

    def forward(self, x):
        # x: b,c,l
        xlist = []
        B, C, L = x.shape
        for i in range(C):
            subx = x[:, [i], :]
            main_trend = F.conv1d(subx, self.w)
            xlist.append(main_trend)
        xlist = torch.cat(xlist, dim=1)
        return xlist
``````

Hi Ximeng!

Am I correct that you want the individual weight-values in your `conv1d()`
kernel to be positive and monotonically increasing, even as they train?

This won’t work.

You are only calling:

``````
self.w = torch.softmax(self.lin * self.alpha + self.beta, dim=0).unsqueeze(0).unsqueeze(0)
``````

once, when you initialize your `Fir1d` model. Even though you update `alpha`
and `beta` when you run your optimizer step, you never recompute `w` (your
kernel weights), so changing the values of `alpha` and `beta` doesn’t actually
do anything.
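
You can see this with a quick check (a minimal sketch, assuming the `Fir1d` class from your post is defined):

``````
fir = Fir1d(k_size=5, device="cpu", init_alpha=1.0)
w_before = fir.w.clone()

with torch.no_grad():
    fir.alpha += 1.0                  # simulate what an optimizer step would do

print(torch.equal(fir.w, w_before))   # True -- the cached kernel never changes
``````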

Recompute `w` inside of `forward()` (and just have it be a local variable of
`Fir1d`'s `forward()` method), e.g.:

``````
        ...
        w = torch.softmax(self.lin * self.alpha + self.beta, dim=0).unsqueeze(0).unsqueeze(0)
        for i in range(C):
            subx = x[:, [i], :]
            main_trend = F.conv1d (subx, w)
            xlist.append(main_trend)
        xlist = torch.cat(xlist, dim=1)
        return xlist
``````

or, probably more efficiently, without the loop:

``````
    def forward(self, x):
        B, C, L = x.shape
        w = torch.softmax (self.lin * self.alpha, dim=0).unsqueeze(0).expand (C, 1, self.lin.size (0))
        return F.conv1d (x, w, groups = C)
``````
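
As a quick sanity check (a sketch, not from the original post), repeating the same kernel across channels with `groups = C` gives the same result as the per-channel loop:

``````
B, C, L, k = 2, 3, 32, 5
x = torch.randn(B, C, L)
w = torch.softmax(torch.linspace(-2, 2, k), dim=0)   # some fixed monotone kernel

# per-channel loop (as in the original forward())
loop_out = torch.cat(
    [F.conv1d(x[:, [i], :], w.view(1, 1, k)) for i in range(C)], dim=1
)

# loop-free: one grouped ("depthwise") convolution with the kernel repeated C times
grouped_out = F.conv1d(x, w.view(1, 1, k).expand(C, 1, k), groups=C)

print(torch.allclose(loop_out, grouped_out))   # True
``````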

I left `self.beta` out of the loop-free version because it doesn’t do anything.
`softmax()` takes “raw-score” logits that it then, in effect, “normalizes.” Adding
the same constant to every logit leaves the result unchanged, so `self.beta` drops
out of the `softmax()` computation.
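
You can confirm that numerically (a small sketch, not part of the original post):

``````
logits = torch.linspace(-2, 2, 5)
beta = 3.7
print(torch.allclose(torch.softmax(logits, dim=0),
                     torch.softmax(logits + beta, dim=0)))   # True -- beta cancels out
``````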

So (even if you leave `beta` in) you will be training only one kernel-weight
parameter, `alpha`.

As an aside, if you wanted your kernel weights to depend on more trainable
parameters, while still being positive and increasing monotonically, you could:

``````
    def __init__(self, k_size, device, init_alpha, init_beta=0):
        ...
        self.kernel_parameters = nn.Parameter (torch.zeros (k_size))   # initial value

    def forward(self, x):
        B, C, L = x.shape
        w = self.kernel_parameters.exp().cumsum (0).unsqueeze (0).expand (C, 1, self.kernel_parameters.size (0))
        return F.conv1d (x, w, groups = C)
``````

Your raw `kernel_parameters` run from `-inf` to `inf`. `.exp()` will cause your
derived weights, `w`, to be positive, and `.cumsum()` will cause your derived
weights to be monotonically increasing.
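
A quick numerical illustration (again just a sketch):

``````
raw = torch.randn(7)                  # unconstrained trainable parameters
w = raw.exp().cumsum(0)               # positive increments, accumulated

print(torch.all(w > 0))               # True -- all weights are positive
print(torch.all(w[1:] > w[:-1]))      # True -- strictly increasing
``````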

Best.

K. Frank

Thank you Frank! You are so professional. You’ve helped me a lot every time XD.