CNN weights pruning methods in PyTorch

Hi.

I want to implement network pruning using PyTorch.

I made a weight histogram to find the pruning threshold.

Then I set to zero the weights that the histogram shows can be pruned.

At this point I have a few questions.

  1. Is there a built-in way to set weights to zero? I wrote 2 to 4 nested for loops… Here is my code:

     for m in net.modules():
         if isinstance(m, nn.Linear):
             wv_fc = m.weight.data

             if number_wv == 4:
                 print("pruning FC1 weights")
                 number = 0
                 for n in wv_fc:
                     for i in range(0, 100):
                         for j in range(0, 64*3*3):
                             if m.weight.data[i][j] <= 0.0404 and m.weight.data[i][j] >= -0.0404:
                                 m.weight.data[i][j] = 0
                             print(m.weight.data[i][j])
                     number = number + 1
                 number_wv = number_wv + 1

             else:
                 print("pruning FC2 weights")
                 number = 0
                 for n in wv_fc:
                     for i in range(0, 10):
                         for j in range(0, 100):
                             if m.weight.data[i][j] <= 0.0404 and m.weight.data[i][j] >= -0.0404:
                                 m.weight.data[i][j] = 0
                             print(m.weight.data[i][j])
                     number = number + 1
                 number_wv = number_wv + 1
    
  2. Once I have done the masking, is there a simple way to freeze only the zeroed weights, rather than a whole layer as with "requires_grad=False"?
    I want to freeze only the zero weights across the entire network.
    In this context, "freeze" means that the frozen weights are not trained anymore.

You can access the parameters by calling .parameters().

So, model.children() gives you the layers, and for each layer, layer.parameters() gives you access to its parameters. It’s basically a generator, so you can iterate over it in a loop. Set the parameters to 0 and requires_grad to False for any nodes you want to prune.

Can you explain with a small example?

I have a tutorial which shows this in great detail. It doesn’t set the weights to zero or set requires_grad to False, but it shows how to access the parameters. Here’s the GitHub repo for the tutorial - https://github.com/Spandan-Madan/A-Collection-of-important-tasks-in-pytorch

In the tutorial, go to cell 22 and modify the lines to be:

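import torch

for child in model.children():
    for param in child.parameters():
        # Zero the values in place; rebinding `param` to a new tensor would leave the model unchanged.
        with torch.no_grad():
            param.zero_()
        param.requires_grad = False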

This will basically prune all nodes of the model. To prune only specific nodes, add if statements to the code above so that it touches only the required ones!
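For pruning by magnitude rather than by whole layer, the nested Python loops from the question can probably be replaced by one masked assignment per layer. Here is a minimal sketch, not a definitive implementation. The layer sizes (64*3*3 inputs, 100 hidden units, 10 outputs) and the 0.0404 threshold are taken from the question; the `net` built here is only a hypothetical stand-in for the real model.

```python
import torch
import torch.nn as nn

# Hypothetical stand-in for the asker's network: FC1 (576 -> 100), FC2 (100 -> 10).
net = nn.Sequential(nn.Linear(64 * 3 * 3, 100), nn.ReLU(), nn.Linear(100, 10))
threshold = 0.0404  # pruning point read off the weight histogram in the question

with torch.no_grad():  # in-place edits to parameters must not be tracked by autograd
    for m in net.modules():
        if isinstance(m, nn.Linear):
            small = m.weight.abs() <= threshold  # boolean mask of weights to prune
            m.weight.masked_fill_(small, 0.0)    # zero them in one vectorized call
            print(f"{m}: pruned {int(small.sum())} of {small.numel()} weights")
```

This only zeroes the values. It does not stop the optimizer from updating them again during training.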

Hope this helps!


I think it is impossible to configure individual weights. The tutorial you recommended also only covers per-layer configuration.

I modified the code like this:

for child in net.children():
    for param in child.layer[0].parameters():
        for i in range(0,16):
            for j in range(0,1):
                for k in range(0,5):
                    for l in range(0,5):
                        if param.data[i][j][k][l] <= 0.0404 and param.data[i][j][k][l] >= -0.0404:
                            param.data[i][j][k][l] = 0
                            param[i][j][k][l].requires_grad = False

But this causes a different error:

Traceback (most recent call last):
  File "main3.py", line 229, in <module>
    param[i][j][k][l].requires_grad = False
RuntimeError: you can only change requires_grad flags of leaf variables. If you want to use a computed variable in a subgraph that doesn't require differentiation use var_no_grad = var.detach().

If I use detach(), nothing changes…
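The error above seems to come from how autograd treats indexing: requires_grad is a flag on a whole tensor, and indexing into a parameter produces a new, computed (non-leaf) tensor rather than a handle to a single weight. A small sketch that reproduces this is below. The Conv2d(1, 16, kernel_size=5) layer is an assumption chosen to match the 16 x 1 x 5 x 5 loop bounds in the code above.

```python
import torch.nn as nn

# Assumed layer; its weight has shape (16, 1, 5, 5), like the loops in the post.
conv = nn.Conv2d(1, 16, kernel_size=5)

w = conv.weight        # the parameter itself is a leaf tensor
elem = w[0][0][0][0]   # indexing goes through autograd and yields a non-leaf tensor

print("parameter is leaf:", w.is_leaf)
print("indexed element is leaf:", elem.is_leaf)

try:
    elem.requires_grad = False  # same operation as in the traceback
except RuntimeError as err:
    print("RuntimeError:", err)
```

Calling detach() on the indexed element returns yet another tensor that is cut off from the graph, so changing flags on it has no effect on how the original parameter is trained.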

You’ll need to use hooks to get access to intermediate gradients. Read this for an example - Why cant I see .grad of an intermediate variable?

You mean making the gradients zero, right?

That’s one way to do it. It’s a little inefficient, since you’re still computing those gradients only to zero them out so that they don’t contribute to the update. But yes, it makes sense, because you can’t drop specific elements of a matrix from the computation graph. If you get this to work, do drop a reply below. I’m curious whether this can be done.
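The gradient-zeroing idea discussed here could look roughly like the sketch below: build a 0/1 mask per weight tensor, zero the pruned entries, and register a hook on each parameter that multiplies its gradient by the mask. This is one possible way to do it under stated assumptions, not a confirmed solution from the thread. The toy `net`, the random data, and the `masks` dictionary are all hypothetical; only the 0.0404 threshold comes from the question.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
net = nn.Sequential(nn.Linear(64, 32), nn.ReLU(), nn.Linear(32, 10))  # toy model
threshold = 0.0404  # pruning threshold from the question

masks = {}
for name, param in net.named_parameters():
    if param.dim() < 2:
        continue  # leave biases alone
    mask = (param.detach().abs() > threshold).to(param.dtype)  # 1 = keep, 0 = pruned
    with torch.no_grad():
        param.mul_(mask)  # zero the pruned weights
    # The hook runs during backward and zeroes the gradient of pruned entries.
    # `mask=mask` binds the current mask to this particular hook.
    param.register_hook(lambda grad, mask=mask: grad * mask)
    masks[name] = mask

optimizer = torch.optim.SGD(net.parameters(), lr=0.1)
x, y = torch.randn(16, 64), torch.randn(16, 10)  # random stand-in data
for _ in range(5):
    optimizer.zero_grad()
    nn.functional.mse_loss(net(x), y).backward()
    optimizer.step()

for name, param in net.named_parameters():
    if name in masks:
        still_zero = bool((param[masks[name] == 0] == 0).all())
        print(name, "pruned weights still zero:", still_zero)
```

The full gradient is still computed, which is the inefficiency mentioned above. A more defensive variant would also multiply each weight by its mask after every optimizer step, in case an optimizer setting moves weights even when their gradient is zero.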

Hi, I have the same problem. How did you solve it in the end?

Hi, look at what is available here https://github.com/opencv/openvino_training_extensions/tree/develop/pytorch_toolkit/nncf. There are two pruning methods and quantization support.