I can’t see a way of doing it. What is the simplest way to constrain the weights of 272 layers of 33 x 33 filters, with each layer constrained to [0, 1] or [-1, 1], during training?
Hi Sze!
There is really no sensible way to precisely constrain parameters
during the training process. The issue is that PyTorch is based on
gradient-descent optimization techniques that evolve parameters
“continuously” (with finite, but presumably small, steps). So there’s
no “continuous” way for a parameter to have its value jump from,
say, 0 to 1.
Probably the best approach is to add a penalty to your loss that
pushes the parameter values toward 0.0 and 1.0, e.g., something
like:
constraint_penalty = alpha * (weight * (weight - 1.0))**2
(For alpha > 0.0, this penalty is exactly zero only when each element
of weight is either 0.0 or 1.0, and is otherwise greater than zero.
You could also use .abs() rather than **2; you might try both.)
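As a concrete sketch, the penalty might be computed like this (the small Conv2d layer, the alpha value, and the .sum() reduction are illustrative assumptions, not something specified above):

```python
import torch

# Illustrative stand-in for one filter layer; substitute your real layers.
conv = torch.nn.Conv2d(1, 1, kernel_size=3, bias=False)
weight = conv.weight

alpha = 10.0  # assumed penalty strength

# (weight * (weight - 1.0))**2 is zero only where an element is exactly
# 0.0 or 1.0; summing gives a scalar that can be added to the loss.
constraint_penalty = alpha * ((weight * (weight - 1.0)) ** 2).sum()
```

For the full model you would sum this term over all 272 layers’ weights before adding it to the loss.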
Then start your training with alpha = 0.0 so your weights can settle
into the neighborhoods of values that give you good predictions. Then
keep training while increasing alpha. When alpha becomes large
enough, constraint_penalty will dominate, and your weights will
become “frozen” at 0.0 and 1.0.
Best.
K. Frank