Pruning weights (not optimizing some part of the weights)

mahdi_morafah · August 19, 2020, 4:47pm

Hello,

I want to exclude some of the weights in a neural network from optimization (backpropagation). For example, suppose my network has two fully connected layers, nn.Linear(784, 84) and nn.Linear(84, 10), so the weights have shapes w1 = [84, 784] and w2 = [10, 84]. I want to optimize only rows 1 and 2 of w2; the other rows of w2 should not be optimized. I have looked through the PyTorch modules and functions, but I still don’t know how to do this in PyTorch. I would really appreciate your help. Thanks in advance.

albanD · August 19, 2020, 4:49pm

Hi,

Maybe the pruning module (torch.nn.utils.prune) with a custom mask is the simplest thing you can use? See its documentation.
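For readers who have not used the pruning utilities, here is a minimal sketch of what a custom mask could look like for the second layer from the question. The mask (keep rows 0 and 1, zero the rest) and the variable names are assumptions made for illustration, not something stated in the thread.

```python
import torch
import torch.nn as nn
import torch.nn.utils.prune as prune

torch.manual_seed(0)
fc2 = nn.Linear(84, 10)

# Assumed mask: keep rows 0 and 1 of the [10, 84] weight, zero out the rest.
mask = torch.zeros(10, 84)
mask[:2] = 1.0

# Reparametrizes the layer so that weight = weight_orig * weight_mask.
prune.custom_from_mask(fc2, name="weight", mask=mask)

x = torch.randn(4, 84)
fc2(x).sum().backward()

# Masked rows are zero in the forward pass and receive zero gradient.
masked_rows_zero = bool((fc2.weight[2:] == 0).all())
masked_grads_zero = bool((fc2.weight_orig.grad[2:] == 0).all())
```

Note that only the weight is masked here; the bias of every output unit is still trained unless it is masked separately.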

mahdi_morafah · August 19, 2020, 4:57pm

Hi,

Thanks for your reply. Actually, I want to prune different parts of the weights at different iterations (and put back the weights that I pruned in the previous iteration). Is there a way to set requires_grad = False for some rows of the weights in a given iteration? (I think that would be easier for my purpose than the prune mask.)

Thanks,

albanD · August 19, 2020, 5:24pm

Hi,

In that case you might have to do it manually. I’m afraid you cannot set autograd-related properties on only a subset of a Tensor.
But you can, for example, save the part of the weight you don’t want changed and restore it after the optimizer’s step.
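A rough sketch of the save-and-restore idea, assuming the two-layer network from the question, plain SGD, and random stand-in data. The names frozen_rows, saved and before_trainable are hypothetical.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
model = nn.Sequential(nn.Linear(784, 84), nn.ReLU(), nn.Linear(84, 10))
opt = torch.optim.SGD(model.parameters(), lr=0.1)
crit = nn.CrossEntropyLoss()

w2 = model[2].weight
frozen_rows = torch.arange(2, 10)  # assumed: everything except rows 0 and 1

x = torch.randn(8, 784)            # stand-in batch
y = torch.randint(0, 10, (8,))

# Save the rows that must not change.
saved = w2.detach()[frozen_rows].clone()
before_trainable = w2.detach()[:2].clone()

opt.zero_grad()
loss = crit(model(x), y)
loss.backward()
opt.step()

# Restore the frozen rows, overwriting whatever the optimizer did to them.
with torch.no_grad():
    w2[frozen_rows] = saved
```

The frozen rows still take part in the forward and backward passes in this variant; they are only prevented from changing.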

mahdi_morafah · August 19, 2020, 8:31pm

This does not work. I want the neural network to be optimized on the pruned version, so that the pruned weights participate in neither the feed-forward pass nor the back-propagation. Using prune with a custom mask, I would have to define a neural network from scratch every time, which I wanted to avoid if possible. Hence, I was thinking of somehow setting specific rows of the weights to zero and freezing them during training (in this case they participate in neither the feed-forward pass nor the back-propagation). But, I don’t know how this can be implemented in pytorch.

Thanks for your help

albanD · August 19, 2020, 9:07pm

But, I don’t know how this can be implemented in pytorch.

I think it depends a lot on the behavior you want.
But just setting part of the Tensor to 0 before the forward and again after the optimizer’s step would work just fine if that’s what you want. Or you could also set some gradients to 0 before doing the optimizer step (be careful, as weight decay/momentum might make parameters with 0 grad still change).
But that depends on how your pruning changes across iterations.
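To make the weight decay caveat concrete, here is a small sketch of the gradient-zeroing option. The helper frozen_rows_unchanged, the single linear layer, and the toy loss are assumptions for illustration only.

```python
import torch
import torch.nn as nn

def frozen_rows_unchanged(weight_decay):
    """Zero the gradient of rows 2..9, take one SGD step, and report
    whether those rows kept their exact values."""
    torch.manual_seed(0)
    layer = nn.Linear(84, 10)
    opt = torch.optim.SGD(layer.parameters(), lr=0.1, weight_decay=weight_decay)
    before = layer.weight.detach().clone()

    loss = layer(torch.randn(8, 84)).pow(2).mean()  # toy loss
    opt.zero_grad()
    loss.backward()
    layer.weight.grad[2:] = 0.0  # rows that should not be optimized
    opt.step()

    return torch.equal(layer.weight.detach()[2:], before[2:])

plain = frozen_rows_unchanged(weight_decay=0.0)    # plain SGD
decayed = frozen_rows_unchanged(weight_decay=0.1)  # SGD with weight decay
```

With plain SGD the rows with zeroed gradients stay put, while with weight decay they shrink even though their gradient is 0. Setting the weights to 0 before the forward and again after the step avoids this issue, because the weights are reset regardless of what the optimizer did.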

I think you can do just that:

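# Assumes dataset yields (sample, target) pairs.
for sample, target in dataset:
    # Zero the pruned entries so they don't contribute to the forward pass
    model.prune_some_weights()
    out = model(sample)
    loss = crit(out, target)
    opt.zero_grad()
    loss.backward()
    opt.step()
    # Zero them again, since the optimizer step may have changed them
    model.prune_some_weights()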
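The loop above leaves prune_some_weights undefined. One possible way to fill it in is sketched below, assuming the two-layer network from the question and a per-row mask stored as a buffer. MaskedNet, row_mask and set_active_rows are hypothetical names, and the data is random stand-in data.

```python
import torch
import torch.nn as nn

class MaskedNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.fc1 = nn.Linear(784, 84)
        self.fc2 = nn.Linear(84, 10)
        # 1 = row of fc2.weight is active, 0 = row is pruned.
        self.register_buffer("row_mask", torch.ones(10))

    def set_active_rows(self, rows):
        # Can be called between iterations to change which rows are pruned.
        self.row_mask.zero_()
        self.row_mask[rows] = 1.0

    def prune_some_weights(self):
        # Zero the pruned rows in place, outside of autograd.
        with torch.no_grad():
            self.fc2.weight.mul_(self.row_mask.unsqueeze(1))

    def forward(self, x):
        return self.fc2(torch.relu(self.fc1(x)))

torch.manual_seed(0)
model = MaskedNet()
model.set_active_rows([0, 1])
opt = torch.optim.SGD(model.parameters(), lr=0.1)
crit = nn.CrossEntropyLoss()
dataset = [(torch.randn(8, 784), torch.randint(0, 10, (8,))) for _ in range(3)]

for sample, target in dataset:
    model.prune_some_weights()
    out = model(sample)
    loss = crit(out, target)
    opt.zero_grad()
    loss.backward()
    opt.step()
    model.prune_some_weights()
```

Zeroing in place discards the previous values of the pruned rows, so putting them back in a later iteration would require keeping a separate copy of them.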