Since the L1 regularizer is not differentiable everywhere, what does PyTorch do when asked to differentiate this function? A simple example shows that PyTorch returns zero at the non-differentiable point x = 0.
import torch
# x = [-1.0, -0.5, 0.0, 0.5, 1.0]; x[2] = 0.0 is where |x| is not differentiable
x = torch.linspace(-1.0, 1.0, 5, requires_grad=True)
y = torch.abs(x)
# backpropagate only through y[2] = |x[2]| = |0|
y[2].backward()
print(x.grad)
tensor([-0., -0., 0., 0., 0.])
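We can isolate the behavior at the kink itself with a minimal sketch (independent of the example above, and the comparison with torch.sign is an observation, not a statement about PyTorch internals): differentiating |x| at exactly x = 0 yields 0, which matches the value of torch.sign at 0.
import torch
# Differentiate |x| at exactly x = 0: PyTorch returns 0 as the gradient.
x0 = torch.tensor(0.0, requires_grad=True)
torch.abs(x0).backward()
print(x0.grad)                         # tensor(0.)
# For comparison, torch.sign(0) is also 0, consistent with PyTorch picking
# the value 0 from the subdifferential [-1, 1] of |x| at x = 0.
print(torch.sign(torch.tensor(0.0)))   # tensor(0.)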
Why is this the case?