I made a weight histogram to find the pruning threshold, and then set the weights selected by the histogram to zero.
At this point I have a few questions.
Is there a better method to set weights to zero? I wrote 2 to 4 nested for loops… Here is my code:
for m in net.modules():
    if isinstance(m, nn.Linear):
        if number_wv == 4:
            print("pruning FC1 weights")
            number = 0
            for i in range(0, 100):
                for j in range(0, 64 * 3 * 3):
                    if -0.0404 <= m.weight.data[i][j] <= 0.0404:
                        m.weight.data[i][j] = 0
                        number = number + 1
            number_wv = number_wv + 1
        else:
            print("pruning FC2 weights")
            number = 0
            for i in range(0, 10):
                for j in range(0, 100):
                    if -0.0404 <= m.weight.data[i][j] <= 0.0404:
                        m.weight.data[i][j] = 0
                        number = number + 1
            number_wv = number_wv + 1
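For reference, the same zeroing can be done without nested loops by using a boolean mask over the whole weight tensor. This is only a sketch: the 0.0404 threshold comes from the code above, but `net` here is a toy stand-in whose layer sizes are assumed.

```python
import torch
import torch.nn as nn

# toy stand-in for `net` (hypothetical sizes, matching the loop bounds above)
net = nn.Sequential(nn.Linear(64 * 3 * 3, 100), nn.Linear(100, 10))

threshold = 0.0404
for m in net.modules():
    if isinstance(m, nn.Linear):
        mask = m.weight.data.abs() <= threshold  # True where |weight| <= threshold
        number = int(mask.sum())                 # how many weights get pruned
        m.weight.data[mask] = 0.0                # zero them all at once
        print("pruned", number, "weights in", m)
```

This replaces the per-element comparison with one vectorized operation, and `mask.sum()` gives the pruned-weight count directly.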
Once the masking is done, is there a simple method to freeze only the zeroed weights specifically, rather than a whole layer with "requires_grad=False"?
I want to freeze only the zero weights across the entire network.
In this context, "freeze" means that the frozen weights cannot be trained anymore.
You can access parameters by calling .parameters().
So model.children() gives you the layers, and for each layer, layer.parameters() gives you access to each parameter. It’s basically a generator, so you can iterate over it in a loop. Set the parameters to 0 and requires_grad to False for any nodes you want to prune.
In the tutorial, go to cell 22 and modify the lines to be:
for child in model.children():
    for param in child.parameters():
        param.data = torch.zeros(param.size())
        param.requires_grad = False
This will basically prune all nodes of the model. To specify which nodes are to be pruned just add if statements in the above code to prune only the required nodes!
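As a hedged sketch of such an if statement (the model below is a toy stand-in, and the 0.0404 threshold is taken from the thread), you can restrict the zeroing to a particular layer type and to small-magnitude weights:

```python
import torch
import torch.nn as nn

# toy model (hypothetical): one conv layer followed by a classifier
model = nn.Sequential(nn.Conv2d(1, 16, 5), nn.Flatten(), nn.Linear(16 * 24 * 24, 10))

for child in model.children():
    if isinstance(child, nn.Conv2d):         # the "if statement": prune conv layers only
        for param in child.parameters():
            small = param.data.abs() <= 0.0404   # select only small-magnitude weights
            param.data[small] = 0.0
```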
for child in net.children():
    for param in child.layer[0].parameters():
        for i in range(0, 16):
            for j in range(0, 1):
                for k in range(0, 5):
                    for l in range(0, 5):
                        if -0.0404 <= param.data[i][j][k][l] <= 0.0404:
                            param.data[i][j][k][l] = 0
                            param[i][j][k][l].requires_grad = False
But then I get another error:
Traceback (most recent call last):
File "main3.py", line 229, in <module>
param[i][j][k][l].requires_grad = False
RuntimeError: you can only change requires_grad flags of leaf variables. If you want to use a computed variable in a subgraph that doesn't require differentiation use var_no_grad = var.detach().
That’s one way to do it. It’s a little inefficient, since you’re still computing things; they’re just zero, so they don’t contribute to the result. But yes, the error makes sense, because you can’t drop specific elements of a matrix from the computation graph. If you get this to work, do drop a reply below; I’m curious whether this can be done.
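For what it’s worth, one common workaround (not from this thread, just a sketch with an assumed toy layer and the thread’s 0.0404 threshold) is to leave the whole tensor in the graph but zero out the gradient at the pruned positions with a tensor hook, so those weights never move during training:

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
net = nn.Linear(8, 4)   # toy layer (hypothetical)
threshold = 0.0404

# prune: zero the small weights and remember where they were
mask = net.weight.data.abs() <= threshold
net.weight.data[mask] = 0.0

# freeze: a hook that zeroes the gradient at pruned positions on every backward pass
net.weight.register_hook(lambda grad: grad.masked_fill(mask, 0.0))

# one training step: the pruned weights stay exactly zero
opt = torch.optim.SGD(net.parameters(), lr=0.1)
loss = net(torch.randn(2, 8)).sum()
loss.backward()
opt.step()
```

Note this assumes plain SGD; an optimizer with weight decay can still move the zeroed weights even with a zero gradient, so in that case the mask would need to be reapplied after each step.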