I made a weight histogram to find the pruning threshold, and then set the weights selected by the histogram to zero.
At this point I have a few questions.
Is there a better method to set weights to zero? I wrote 2 to 4 nested for loops… Here is my code:
for m in net.modules():
    if isinstance(m, nn.Linear):
        if number_wv == 4:
            print("pruning FC1 weights")
            number = 0
            for i in range(0, 100):
                for j in range(0, 64 * 3 * 3):
                    if -0.0404 <= m.weight.data[i][j] <= 0.0404:
                        m.weight.data[i][j] = 0
                        number = number + 1
            number_wv = number_wv + 1
        else:
            print("pruning FC2 weights")
            number = 0
            for i in range(0, 10):
                for j in range(0, 100):
                    if -0.0404 <= m.weight.data[i][j] <= 0.0404:
                        m.weight.data[i][j] = 0
                        number = number + 1
            number_wv = number_wv + 1
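For reference, the same zeroing can be done without nested loops by using a boolean mask over the whole weight tensor. This is only a sketch: the 0.0404 threshold comes from the code above, but `net` here is a toy stand-in whose layer sizes are assumed.

```python
import torch
import torch.nn as nn

# toy stand-in for `net` (hypothetical sizes, matching the loop bounds above)
net = nn.Sequential(nn.Linear(64 * 3 * 3, 100), nn.Linear(100, 10))

threshold = 0.0404
for m in net.modules():
    if isinstance(m, nn.Linear):
        mask = m.weight.data.abs() <= threshold  # True where |weight| <= threshold
        number = int(mask.sum())                 # how many weights get pruned
        m.weight.data[mask] = 0.0                # zero them all at once
        print("pruned", number, "weights in", m)
```

This replaces the per-element comparison with one vectorized operation, and `mask.sum()` gives the pruned-weight count directly.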
Once the masking is done, is there a simple method to freeze only the zeroed weights specifically, rather than a whole layer with "requires_grad=False"?
I want to freeze only the zero weights across the entire network.
In this context, "freeze" means that the frozen weights cannot be trained anymore.
You can access parameters by calling .parameters().
So model.children() gives you the layers, and for each layer, layer.parameters() gives you access to each parameter. It’s basically a generator, so you can iterate over it in a loop. Set the parameters to 0 and requires_grad to False for any nodes you want to prune.
In the tutorial, go to cell 22 and modify the lines to be:
for child in model.children():
    for param in child.parameters():
        param.data = torch.zeros(param.size())
        param.requires_grad = False
This will basically prune all nodes of the model. To specify which nodes are to be pruned just add if statements in the above code to prune only the required nodes!
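As a hedged sketch of such an if statement (the model below is a toy stand-in, and the 0.0404 threshold is taken from the thread), you can restrict the zeroing to a particular layer type and to small-magnitude weights:

```python
import torch
import torch.nn as nn

# toy model (hypothetical): one conv layer followed by a classifier
model = nn.Sequential(nn.Conv2d(1, 16, 5), nn.Flatten(), nn.Linear(16 * 24 * 24, 10))

for child in model.children():
    if isinstance(child, nn.Conv2d):         # the "if statement": prune conv layers only
        for param in child.parameters():
            small = param.data.abs() <= 0.0404   # select only small-magnitude weights
            param.data[small] = 0.0
```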
for child in net.children():
    for param in child.layer[0].parameters():
        for i in range(0, 16):
            for j in range(0, 1):
                for k in range(0, 5):
                    for l in range(0, 5):
                        if -0.0404 <= param.data[i][j][k][l] <= 0.0404:
                            param.data[i][j][k][l] = 0
                            param[i][j][k][l].requires_grad = False
But then I get another error:
Traceback (most recent call last):
File "main3.py", line 229, in <module>
param[i][j][k][l].requires_grad = False
RuntimeError: you can only change requires_grad flags of leaf variables. If you want to use a computed variable in a subgraph that doesn't require differentiation use var_no_grad = var.detach().
That’s one way to do it. It’s a little inefficient, since you’re still computing things; they’re just zero, so they don’t contribute to the result. But yes, the error makes sense, because you can’t drop specific elements of a matrix from the computation graph. If you get this to work, do drop a reply below; I’m curious whether this can be done.
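For what it’s worth, one common workaround (not from this thread, just a sketch with an assumed toy layer and the thread’s 0.0404 threshold) is to leave the whole tensor in the graph but zero out the gradient at the pruned positions with a tensor hook, so those weights never move during training:

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
net = nn.Linear(8, 4)   # toy layer (hypothetical)
threshold = 0.0404

# prune: zero the small weights and remember where they were
mask = net.weight.data.abs() <= threshold
net.weight.data[mask] = 0.0

# freeze: a hook that zeroes the gradient at pruned positions on every backward pass
net.weight.register_hook(lambda grad: grad.masked_fill(mask, 0.0))

# one training step: the pruned weights stay exactly zero
opt = torch.optim.SGD(net.parameters(), lr=0.1)
loss = net(torch.randn(2, 8)).sum()
loss.backward()
opt.step()
```

Note this assumes plain SGD; an optimizer with weight decay can still move the zeroed weights even with a zero gradient, so in that case the mask would need to be reapplied after each step.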