Pruning weights (not optimizing some part of the weights)


I want to exclude some part of the weights in a neural network from optimization (back-propagation). For example, say my network has two fully connected layers, nn.Linear(784, 84) and nn.Linear(84, 10), so the weights have shapes w1 = [84, 784] and w2 = [10, 84]. I want to optimize only rows 1 and 2 of w2; the other rows of w2 should not be optimized. I have checked the PyTorch modules and functions, but I still don’t know how to do this in PyTorch. I would really appreciate your help. Thanks in advance.


Maybe the pruning module with a custom mask is going to be the simplest thing you can use? doc here.
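
For reference, a minimal sketch of what the custom-mask route could look like. It assumes torch.nn.utils.prune.custom_from_mask and reads "rows 1 and 2" as the first two rows of w2 (indices 0 and 1); the name fc2 and the random input are made up for illustration.

```python
import torch
import torch.nn as nn
import torch.nn.utils.prune as prune

torch.manual_seed(0)
fc2 = nn.Linear(84, 10)  # hypothetical stand-in for the second layer

# The mask has the same shape as the weight: 1 keeps an entry, 0 prunes it.
mask = torch.zeros_like(fc2.weight)
mask[:2] = 1.0  # assumption: keep only the first two rows trainable

prune.custom_from_mask(fc2, name="weight", mask=mask)

# fc2.weight is now weight_orig * weight_mask, recomputed before each forward pass.
x = torch.randn(4, 84)
out = fc2(x)
out.sum().backward()

# Gradients reach the dense parameter weight_orig only through the mask,
# so the pruned rows receive a zero gradient.
grad = fc2.weight_orig.grad
```

After the call, the dense values live in fc2.weight_orig and the mask in the fc2.weight_mask buffer, so the pruned values are not lost.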


Thanks for your reply. Actually, I want to prune different parts of the weights at different iterations (and put back the weights that I pruned in the previous iteration). Is there a way to set requires_grad = False for some rows of the weights in an iteration? (I think that would be easier for my purpose than the prune mask.)



In that case you might have to do it manually. I’m afraid you cannot set autograd-related properties on only a subset of a Tensor.
But you can, for example, save the part of the weight you don’t want changed and restore it after the optimizer’s step.
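
A rough sketch of the save-and-restore idea, under some assumptions: the two-layer model, the SGD settings, the random batch, and the choice of frozen rows are all placeholders, not something from the question.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)
model = nn.Sequential(nn.Linear(784, 84), nn.ReLU(), nn.Linear(84, 10))
opt = torch.optim.SGD(model.parameters(), lr=0.1)

w2 = model[2].weight
frozen_rows = slice(2, None)  # hypothetical: every row except the first two

x, y = torch.randn(8, 784), torch.randint(0, 10, (8,))

before = w2.detach().clone()                  # only used to inspect the result
saved = w2.detach()[frozen_rows].clone()      # copy of the rows to protect

opt.zero_grad()
loss = F.cross_entropy(model(x), y)
loss.backward()
opt.step()                                    # updates every row of w2

with torch.no_grad():
    w2[frozen_rows] = saved                   # put the protected rows back
```

Note that the protected rows still take part in the forward pass here; only their update is undone.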

This does not work. I want the neural network to be optimized on the pruned version, so the pruned weights participate in neither the feed-forward pass nor the back-propagation. Using prune with a custom mask, I would have to define the neural network from scratch every time, which I want to avoid if possible. Hence, I was thinking of somehow setting specific rows of the weights to zero and freezing them during training (in that case they participate in neither the feed-forward pass nor the back-propagation). But I don’t know how this can be implemented in PyTorch.

Thanks for your help

But, I don’t know how this can be implemented in pytorch.

I think it depends a lot on the behavior you want.
But just setting part of the Tensor to 0 before the forward and again after the optimizer’s step would work just fine if that’s what you want. Or you could set some gradients to 0 before doing the optimizer step (be careful: weight decay or momentum can still change parameters whose grad is 0).
But that depends on how your pruning changes across iterations.
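
To illustrate the weight decay caveat, here is a small sketch that zeroes the gradient of some rows before the step and measures how much those rows move. The helper one_step, the single layer fc2, the learning rate, and the random batch are hypothetical choices for the demonstration.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def one_step(**sgd_kwargs):
    """Run one SGD step with the gradient of rows 2..9 zeroed.

    Returns the largest absolute change among those rows.
    """
    torch.manual_seed(0)
    fc2 = nn.Linear(84, 10)
    opt = torch.optim.SGD(fc2.parameters(), lr=0.1, **sgd_kwargs)
    before = fc2.weight.detach().clone()

    x, y = torch.randn(8, 84), torch.randint(0, 10, (8,))
    opt.zero_grad()
    F.cross_entropy(fc2(x), y).backward()
    fc2.weight.grad[2:] = 0          # rows 2..9 get a zero gradient
    opt.step()

    return (fc2.weight.detach()[2:] - before[2:]).abs().max().item()

plain_change = one_step()                    # plain SGD
decay_change = one_step(weight_decay=0.1)    # SGD with weight decay
print(plain_change, decay_change)
```

With plain SGD the rows with zero gradient stay exactly where they were, while with weight decay they shrink a little even though their gradient is 0. If the rows are also set to 0 beforehand, weight decay has nothing to shrink, but momentum accumulated in earlier steps can still move them.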

I think you can do just that:

for sample in dataset:
    out = model(sample)
    loss = crit(out)