Can't optim.SGD keep weights fixed manually?

I wrote a function that keeps zero weights fixed at zero during training by setting the gradients of those zero weights to zero.

def fix_tensor_data(parameters):
    # Zero out the gradient wherever the corresponding weight is zero,
    # so that weight should stay at zero after the update.
    masked_grads = []
    for t in parameters:
        if t.grad is not None:
            mask = t.data.abs().gt(0).float().to(t.grad.device)
            t.grad.mul_(mask)
            masked_grads.append(t.grad)
    return masked_grads

I checked that the gradients in the optimizer are zero after applying this function, but the zero weights still change when I run optimizer.step().

The optimizer used for the weight update is shown below (this is from DARTS, an AutoML model):

  optimizer = torch.optim.SGD(
      model.parameters(),
      args.learning_rate,
      momentum=args.momentum,
      weight_decay=args.weight_decay)

I call fix_tensor_data before optimizer.step() and checked that the gradients in the optimizer's params are zero after applying fix_tensor_data:

    fix_tensor_data(model.parameters())

    print(optimizer.param_groups[0]["params"][0].grad)
    optimizer.step()

But when I run optimizer.step() and print model.parameters(), there are no zero weights left in the parameter tensors.

So I looked at the step() source in PyTorch, but I couldn't figure it out…

What is the problem with my code?

I really need help… Thank you for reading my question!

If you are using momentum in the optimizer and there were previous valid updates for these parameters, they will still be updated, as seen here:

import torch

# no momentum
x = torch.ones(1, 5, requires_grad=True)
optimizer = torch.optim.SGD([x], lr=0.1, momentum=0.)

# perform a valid update
print('no momentum, valid update')
x.grad = torch.ones_like(x)
print(x)
optimizer.step()
print(x)

# zero out the gradients and check if the parameter is still updated
for _ in range(3):
    print('no momentum, update with zero grad')
    x.grad = torch.zeros_like(x)
    print(x)
    optimizer.step()
    print(x)

# with momentum
optimizer = torch.optim.SGD([x], lr=0.1, momentum=0.1)

# perform a valid update
print('momentum, valid update')
x.grad = torch.ones_like(x)
print(x)
optimizer.step()
print(x)

# zero out the gradients and check if the parameter is still updated
for _ in range(3):
    print('momentum, update with zero grad')
    x.grad = torch.zeros_like(x)
    print(x)
    optimizer.step()
    print(x)

Output:

no momentum, valid update
tensor([[1., 1., 1., 1., 1.]], requires_grad=True)
tensor([[0.9000, 0.9000, 0.9000, 0.9000, 0.9000]], requires_grad=True)
no momentum, update with zero grad
tensor([[0.9000, 0.9000, 0.9000, 0.9000, 0.9000]], requires_grad=True)
tensor([[0.9000, 0.9000, 0.9000, 0.9000, 0.9000]], requires_grad=True)
no momentum, update with zero grad
tensor([[0.9000, 0.9000, 0.9000, 0.9000, 0.9000]], requires_grad=True)
tensor([[0.9000, 0.9000, 0.9000, 0.9000, 0.9000]], requires_grad=True)
no momentum, update with zero grad
tensor([[0.9000, 0.9000, 0.9000, 0.9000, 0.9000]], requires_grad=True)
tensor([[0.9000, 0.9000, 0.9000, 0.9000, 0.9000]], requires_grad=True)
momentum, valid update
tensor([[0.9000, 0.9000, 0.9000, 0.9000, 0.9000]], requires_grad=True)
tensor([[0.8000, 0.8000, 0.8000, 0.8000, 0.8000]], requires_grad=True)
momentum, update with zero grad
tensor([[0.8000, 0.8000, 0.8000, 0.8000, 0.8000]], requires_grad=True)
tensor([[0.7900, 0.7900, 0.7900, 0.7900, 0.7900]], requires_grad=True)
momentum, update with zero grad
tensor([[0.7900, 0.7900, 0.7900, 0.7900, 0.7900]], requires_grad=True)
tensor([[0.7890, 0.7890, 0.7890, 0.7890, 0.7890]], requires_grad=True)
momentum, update with zero grad
tensor([[0.7890, 0.7890, 0.7890, 0.7890, 0.7890]], requires_grad=True)
tensor([[0.7889, 0.7889, 0.7889, 0.7889, 0.7889]], requires_grad=True)
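
With momentum, SGD keeps a velocity buffer v = momentum * v + grad and updates the weight as w = w - lr * v, so a zero gradient no longer guarantees a zero update once the buffer is non-zero. As a follow-up sketch (not part of the original answer), one workaround is to re-apply the mask to the parameter data right after optimizer.step(); the mask tensor and the toy setup below are hypothetical, only for illustration:

import torch

x = torch.ones(1, 5, requires_grad=True)
mask = torch.tensor([[1., 0., 1., 0., 1.]])  # 0 marks weights that should stay at zero
optimizer = torch.optim.SGD([x], lr=0.1, momentum=0.9)

# one valid, unmasked update so the momentum buffer becomes non-zero
x.grad = torch.ones_like(x)
optimizer.step()

# "prune": zero out the masked weights once
with torch.no_grad():
    x.mul_(mask)

for _ in range(3):
    x.grad = torch.ones_like(x)   # pretend backward() produced this gradient
    x.grad.mul_(mask)             # zero the gradient of the pruned weights
    optimizer.step()              # momentum still moves the pruned weights...
    with torch.no_grad():
        x.mul_(mask)              # ...so force them back to zero after the step
    print(x)

Alternatively, the momentum buffers stored in the optimizer's state could be masked as well, but masking the parameter data after the step is the simplest option.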

Thank you for the kind comment! I'll try it with 0 momentum!