Constrained PReLU

I would like to use a 1-Lipschitz continuous function as an activation function in my network, such as ReLU or LeakyReLU (with negative slope between 0 and 1).

I also would like to use an activation function similar to PReLU with a learnable parameter while having the 1-Lipschitz continuous property. Are there any ways that I can add a constraint to the torch.nn.PReLU such that the learnable parameter is bound between 0 and 1?

Hi Frankie!

Probably the best way is to train an unbounded parameter, but transform
it into a bounded prelu-weight with sigmoid():

>>> import torch
>>> torch.__version__
'1.13.1'
>>> _ = torch.manual_seed (2023)
>>> wt = torch.randn (1, requires_grad = True)   # learnable unbounded weight
>>> wt_bnd = torch.sigmoid (wt)                  # derived bounded weight
>>> loss = torch.nn.functional.prelu (torch.randn (5), wt_bnd).sum()
>>> loss.backward()
>>> wt.grad
tensor([-0.1999])

wt_bnd, the weight parameter passed to the functional version of prelu(),
is bounded between 0 and 1, while the actual trainable parameter, wt, is
unbounded, so you can train it without needing to add constraints somehow
to the optimizer.
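As a sketch of how this trick can be packaged for use in a network (the module and attribute names here are my own, not part of pytorch), you can wrap the unbounded parameter and the sigmoid() transform in a small Module:

```python
import torch
import torch.nn as nn

class BoundedPReLU(nn.Module):
    """PReLU whose negative slope is constrained to (0, 1) via sigmoid()."""
    def __init__(self, init: float = 0.25):
        super().__init__()
        # train an unbounded "raw" weight; sigmoid() of it is the actual slope.
        # initialize the raw weight so that sigmoid (raw_weight) == init
        self.raw_weight = nn.Parameter(torch.logit(torch.tensor([init])))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # the slope passed to prelu() is always strictly between 0 and 1
        return nn.functional.prelu(x, torch.sigmoid(self.raw_weight))
```

Because the optimizer only ever sees raw_weight, no explicit constraint or clamping step is needed; for example, BoundedPReLU() applied to tensor([-2.0, 3.0]) gives tensor([-0.5, 3.0]) with the default slope of 0.25.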

Best.

K. Frank

Thanks for your suggestions. To me, this is a normalization of the learnable parameter, but can we now say that our activation function is 1-Lipschitz continuous?

Hi Frankie!

I’m not sure what you are asking.

If you’re asking whether our “constrained prelu” is Lipschitz continuous
when viewed as a function of just its input (with its weight parameter
held fixed), the answer is yes. prelu (input) is piecewise linear with
slope 1 for positive input and slope weight for negative input, so with
weight bounded between 0 and 1, it is 1-Lipschitz continuous.
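As a quick numerical sanity check (a sketch, not a proof), you can verify that no difference quotient of prelu() exceeds 1 when the weight comes from sigmoid():

```python
import torch

torch.manual_seed(2023)
w = torch.sigmoid(torch.randn(1))        # bounded weight in (0, 1)
x1 = torch.randn(10000)
x2 = torch.randn(10000)
mask = x1 != x2                          # avoid 0 / 0 in the quotient
y1 = torch.nn.functional.prelu(x1[mask], w)
y2 = torch.nn.functional.prelu(x2[mask], w)
# every difference quotient |f(x1) - f(x2)| / |x1 - x2| is at most 1,
# consistent with prelu (input) being 1-Lipschitz in its input
ratios = (y1 - y2).abs() / (x1[mask] - x2[mask]).abs()
print(ratios.max().item())
```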

If you’re asking whether prelu (input, weight) is Lipschitz continuous
when understood as a function of two variables, the answer is no, even
though the weight we pass in is bounded. For input < 0,
prelu (input, weight) = weight * input, so the partial derivative,
d prelu (input, weight) / d weight = input, is unbounded
because input can run off to -inf.

Best.

K. Frank

Hi Frank,

Thanks for your answers. I was trying to ask whether our “constrained prelu” is Lipschitz continuous. It is all clear to me now.

Thanks for all your support.

Best,

Frankie