Yes, sure, thank you.
Continuing the code from my original post:
mask_weights.register_hook(print)
z = torch.Tensor([[1], [1]])
# tensor([[ 1.],
#         [ 1.]])
out = (y-z).mean()
# tensor(-0.6595)
out.backward()
# printed by the hook on mask_weights:
# tensor([[ 0.1920, 0.1757, 0.0046],
#         [ 0.1920, 0.1757, 0.0046]])
weights.grad
# tensor([[ 0.0000, 0.1757, 0.0000],
#         [ 0.1920, 0.0000, 0.0046]])
As you can see, the gradient of mask_weights (the tensor printed by the hook)
is not masked, while weights.grad is.
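To make this reproducible without the original post, here is a minimal sketch of what I believe is going on. It assumes mask_weights was built as weights * mask; the mask values, the input x, and the way y is computed are hypothetical stand-ins, not the code from the original post. The hook stores the gradient instead of printing it, so the two gradients can be compared directly.

```python
import torch

torch.manual_seed(0)

weights = torch.randn(2, 3, requires_grad=True)
# Assumed mask, chosen to match the zero pattern seen in weights.grad above.
mask = torch.tensor([[0., 1., 0.],
                     [1., 0., 1.]])
mask_weights = weights * mask  # assumption: masking is an elementwise product

captured = {}

def save_grad(grad):
    # Store the gradient that reaches mask_weights; returning None leaves it unchanged.
    captured["mask_weights"] = grad.clone()

mask_weights.register_hook(save_grad)

# Hypothetical input and forward pass standing in for the original computation of y.
x = torch.tensor([[0.5, 1.0, 2.0],
                  [0.5, 1.0, 2.0]])
y = (mask_weights * x).sum(dim=1, keepdim=True)
z = torch.ones(2, 1)

out = (y - z).mean()
out.backward()

print(captured["mask_weights"])  # d(out)/d(mask_weights): dense, no zeros from the mask
print(weights.grad)              # d(out)/d(weights): the same values multiplied by mask
```

Under these assumptions, the mask is only applied when the gradient flows back through the multiplication into weights. The gradient with respect to mask_weights itself is taken before that step, so it stays dense.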