Kernel Constraint similar to the one implemented in Keras

ilyes · July 7, 2019, 4:02pm

Hi,

I want to add a constraint (max_norm) to my 2D convolutional layer’s weights. For example we can do that easily in Keras using:

keras.layers.Conv2D(8, (3, 2), activation='relu', kernel_constraint=max_norm(1.))

which makes a convolutional layer with 8 kernels, each of size (3, 2).

Is there a way to do the same in PyTorch? I searched the forum but couldn’t find anything relevant.

Thank you

tom · July 8, 2019, 5:44am

The usual way to do this is to use the functional interface to redefine the forward:

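import torch.nn as nn
import torch.nn.functional as F

class ConstrainedConv2d(nn.Conv2d):
    def forward(self, input):
        # Clamp every weight element into [-1, 1] before running the convolution
        return F.conv2d(input, self.weight.clamp(min=-1.0, max=1.0), self.bias, self.stride,
                        self.padding, self.dilation, self.groups)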

The important bit here being the .clamp. For spectral norm, there is a utility in torch.nn.utils.
In IPython, you can get the forward of Conv2d using ?? nn.Conv2d to ensure feature parity (e.g. I left out the padding_mode).
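
For the spectral norm utility mentioned above, a minimal sketch could look like the following. It assumes `torch.nn.utils.spectral_norm` is the utility meant; the layer sizes and the input shape are made up for illustration.

```python
import torch
import torch.nn as nn
from torch.nn.utils import spectral_norm

# Wrap a regular conv layer; the wrapper rescales the weight by an
# estimate of its largest singular value on every forward pass.
conv = spectral_norm(nn.Conv2d(1, 8, kernel_size=(3, 2)))

x = torch.randn(4, 1, 16, 16)  # illustrative input: batch of 4, 1 channel, 16x16
y = conv(x)
print(y.shape)  # torch.Size([4, 8, 14, 15])
```

Note that spectral normalization is a different constraint from both clamping and max-norm, so it is not a replacement for the Keras `max_norm` behaviour.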

Best regards

Thomas

ilyes · July 8, 2019, 8:58am

Thank you! I still have a problem with the exact implementation. Let’s say I have code as follows:

class Model(nn.Module):
    def __init__(self):
        super(Model, self).__init__()
        self.H = 200
        self.W = 200
        self.conv = nn.Conv2d(1, 8, (1, 101), padding=(0, 50), bias=False)
        self.fc1 = nn.Linear(self.H * self.W, 1)

    def forward(self, x):
        x = self.conv(x)
        x = F.relu(x)
        x = x.view(-1, self.H * self.W)
        x = torch.sigmoid(self.fc1(x))
        return x

Now, do I have to implement ConstrainedConv2d like you did

class ConstrainedConv2d(nn.Conv2d):
    def forward(self, input):
        return F.conv2d(input, self.weight.clamp(min=-1.0, max=1.0), self.bias, self.stride,
                        self.padding, self.dilation, self.groups)

and replace the Conv2d defined in my __init__ with ConstrainedConv2d? And if so, how do I pass the desired parameters (input channels, output channels, kernel size …) to ConstrainedConv2d?

tom · July 8, 2019, 9:21am

ConstrainedConv2d inherits its __init__ from nn.Conv2d, so you can use it as a drop-in replacement.
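
As a rough illustration of the drop-in use, the subclass can be built with exactly the arguments from the question's `nn.Conv2d` call. The input tensor below is an assumption (a batch of two 200x200 single-channel images).

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ConstrainedConv2d(nn.Conv2d):
    def forward(self, input):
        # Clamp every weight element into [-1, 1] before convolving
        return F.conv2d(input, self.weight.clamp(min=-1.0, max=1.0), self.bias,
                        self.stride, self.padding, self.dilation, self.groups)

# Same constructor arguments as nn.Conv2d(1, 8, (1, 101), padding=(0, 50), bias=False)
conv = ConstrainedConv2d(1, 8, (1, 101), padding=(0, 50), bias=False)

x = torch.randn(2, 1, 200, 200)  # assumed input shape
out = conv(x)
print(out.shape)  # torch.Size([2, 8, 200, 200])
```

Inside the model, this would mean swapping `nn.Conv2d(...)` for `ConstrainedConv2d(...)` in `__init__` and leaving the rest unchanged.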

Best regards

Thomas

ilyes · July 8, 2019, 9:43am

Got it! Thank you Thomas

lkc · September 18, 2020, 6:27pm

Wouldn’t this constrain each individual weight to be between -1.0 and 1.0? In Keras, the max_norm constraint is on the norm itself, and the weights are rescaled proportionally. Something along the following lines:

    def _max_norm(self, w):
        norm = w.norm(2, dim=0, keepdim=True)
        desired = torch.clamp(norm, 0, self._max_norm_val)
        return w * (desired / (self._eps + norm))

This is taken from this post.

Max Norm Constraint

If a hidden unit’s weight vector’s L2 norm L ever gets bigger than a certain max value c, multiply the weight vector by c/L. Enforce it immediately after each weight vector update or after every X gradient updates.

This constraint is another form of regularization. While L2 penalizes high weights using the loss function, “max norm” acts directly on the weights. L2 exerts a constant pressure to move the weights near zero which could throw away useful information when the loss function doesn’t provide incentive for the weights to remain far from zero. On the other hand, “max norm” never drives the weights to near zero. As long as the norm is less than the constraint value, the constraint has no effect.
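
A small numeric sketch of the difference between element-wise clamping and the max-norm rule described above. The helper name `max_norm`, the example vectors, and the `eps` value are all assumptions chosen for illustration.

```python
import torch

def max_norm(w, c, eps=1e-8):
    # Rescale w so its L2 norm is at most c; leave it unchanged if already smaller
    norm = w.norm(p=2)
    desired = torch.clamp(norm, max=c)
    return w * (desired / (norm + eps))

c = 1.0
w = torch.tensor([3.0, 4.0])       # L2 norm is 5, above the limit c

clamped = w.clamp(min=-c, max=c)   # element-wise clamp: tensor([1., 1.])
rescaled = max_norm(w, c)          # proportional rescale: approx. tensor([0.6, 0.8])

small = torch.tensor([0.3, 0.4])   # L2 norm is 0.5, below the limit c
untouched = max_norm(small, c)     # approx. tensor([0.3, 0.4])
```

Clamping changes the direction of the vector and still leaves its norm above 1 (about 1.414 here), whereas the max-norm rule keeps the direction and brings the norm down to c.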

Since the OP accepted the given answer, the weight constraint probably solved their problem. But just for future reference, I think max_norm is different from clamping the weights directly. Let me know if I missed something here.
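
One possible way to get a max-norm constraint on a conv layer is sketched below. `MaxNormConv2d` and `apply_constraint` are hypothetical names, not part of PyTorch. The sketch assumes one norm per output filter and projects the weights after each optimizer step; the input shape and the `max_norm=0.25` value are illustrative only.

```python
import torch
import torch.nn as nn

class MaxNormConv2d(nn.Conv2d):
    """Conv2d whose filters can be projected back to an L2 norm <= max_norm (hypothetical helper)."""

    def __init__(self, *args, max_norm=1.0, eps=1e-8, **kwargs):
        super().__init__(*args, **kwargs)
        self.max_norm = max_norm
        self.eps = eps

    @torch.no_grad()
    def apply_constraint(self):
        # weight shape: (out_channels, in_channels / groups, kH, kW)
        # Assumption: one L2 norm per output filter
        norm = self.weight.flatten(1).norm(p=2, dim=1).view(-1, 1, 1, 1)
        desired = norm.clamp(max=self.max_norm)
        self.weight.mul_(desired / (norm + self.eps))

conv = MaxNormConv2d(1, 8, (1, 101), padding=(0, 50), bias=False, max_norm=0.25)
opt = torch.optim.SGD(conv.parameters(), lr=0.1)

x = torch.randn(2, 1, 20, 200)   # assumed input shape
loss = conv(x).pow(2).mean()     # dummy loss, only to produce gradients
opt.zero_grad()
loss.backward()
opt.step()
conv.apply_constraint()          # project after the update, as in the quoted description

filter_norms = conv.weight.flatten(1).norm(p=2, dim=1)
print(filter_norms)              # each entry should be at most 0.25
```

Which dimensions the norm is taken over is a design choice: the snippet quoted above uses `dim=0`, while this sketch treats each output filter as one weight vector.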