I want to add an L1 regularization term to my loss, but I am a little bit confused.

When I print the shapes of the CNN's parameters, each one has a different number of dimensions.

```
>>> [print(p.shape) for p in model.parameters()]
torch.Size([64, 3, 3, 3])
torch.Size([64])
torch.Size([64])
torch.Size([64, 64, 3, 3])
torch.Size([64])
torch.Size([64])
torch.Size([64, 64, 3, 3])
torch.Size([64])
torch.Size([64])
torch.Size([64, 64, 3, 3])
torch.Size([64])
torch.Size([64])
torch.Size([64, 64, 3, 3])
torch.Size([64])
torch.Size([64])
torch.Size([128, 64, 3, 3])
torch.Size([128])
torch.Size([128])
torch.Size([128, 128, 3, 3])
torch.Size([128])
torch.Size([128])
torch.Size([128, 64, 1, 1])
torch.Size([128])
torch.Size([128])
torch.Size([128, 128, 3, 3])
...
```

**1) Use torch.linalg.norm**

```
# note: "lambda" is a reserved word in Python, so I use lambda_ instead
cost += lambda_ * sum(torch.linalg.norm(p, ord=1) for p in model.parameters())
```

When I use torch.linalg.norm, I have to choose a dim:

```
>>> sum(torch.linalg.norm(p, 1) for p in model.parameters())
Traceback (most recent call last):
File "<string>", line 1, in <module>
File "<string>", line 1, in <genexpr>
RuntimeError: 'dim' must specify 1 or 2 dimensions when order is numerical and input is not 1-D or 2-D
```

But when I choose dimension 0, the outputs have different shapes, so summing them is impossible:

```
>>> sum(torch.linalg.norm(p, 1, 0) for p in model.parameters())
Traceback (most recent call last):
File "<string>", line 1, in <module>
RuntimeError: The size of tensor a (3) must match the size of tensor b (64) at non-singleton dimension 0
```

How can I add an L1 regularization term to the loss using torch.linalg.norm?
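One workaround that seems to run is to flatten each parameter to 1-D before taking the norm, since ord=1 is accepted for 1-D inputs. A minimal sketch (the two-layer model here is just a hypothetical stand-in for my CNN, and lambda_ = 1e-4 is an arbitrary strength):

```python
import torch
import torch.nn as nn

# hypothetical stand-in for the CNN whose parameter shapes are printed above
model = nn.Sequential(nn.Conv2d(3, 64, 3), nn.BatchNorm2d(64))

lambda_ = 1e-4  # arbitrary regularization strength, for illustration only

# flattening makes every parameter 1-D, so ord=1 no longer needs a dim
l1 = sum(torch.linalg.norm(p.flatten(), ord=1) for p in model.parameters())
cost = lambda_ * l1
```

But I am not sure whether flattening is the intended way to use this function.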

**2) Use torch.Tensor.norm**

```
# again using lambda_ because "lambda" is a reserved word
cost += lambda_ * sum(p.norm(1) for p in model.parameters())
```

It works, but the resulting value is very large. Is this the right way to add an L1 regularization term to the loss?

```
>>> sum(p.norm(1) for p in model.parameters())
tensor(111630.8516, device='cuda:0', grad_fn=<AddBackward0>)
```

**3) What's the difference between torch.linalg.norm and torch.Tensor.norm?**
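From what I can tell, the two agree on 1-D input, while on higher-dimensional input torch.linalg.norm demands a dim for numerical orders, and Tensor.norm silently flattens. torch.linalg.vector_norm (if I understand its default correctly) flattens over all elements unless a dim is given, so it may be the closest modern replacement:

```python
import torch

v = torch.randn(10)
t = torch.randn(4, 3, 3, 3)

# on 1-D input, both functions give the same L1 norm
assert torch.allclose(torch.linalg.norm(v, ord=1), v.norm(1))

# Tensor.norm(1) silently flattens n-D input ...
assert torch.allclose(t.norm(1), t.abs().sum())

# ... while torch.linalg.vector_norm flattens by default, explicitly
assert torch.allclose(torch.linalg.vector_norm(t, ord=1), t.abs().sum())
```

Is that the whole difference, or is there more to it?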