Weights disconnection implementation

Yes, sure, thank you.

Continuing my code from the original post:
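
For completeness, here is a rough sketch of the setup those lines assume. This is a reconstruction, not the exact original code; the mask pattern is inferred from the masked weights.grad output below, and the random values mean the printed numbers will not reproduce exactly:

import torch

# Sketch of the assumed setup: a weight matrix, a 0/1 connectivity mask,
# and a forward pass through the masked weights.
weights = torch.rand(2, 3, requires_grad=True)   # trainable leaf weights
mask = torch.tensor([[0., 1., 0.],
                     [1., 0., 1.]])              # 0 = disconnected weight
mask_weights = weights * mask                    # masked weights used in the forward pass
x = torch.rand(3, 1)
y = torch.mm(mask_weights, x)                    # forward pass

With those definitions in place, the continuation: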

# Print the gradient flowing into mask_weights during backward
mask_weights.register_hook(print)

z = torch.Tensor([[1], [1]])
# tensor([[ 1.],
#         [ 1.]])

out = (y-z).mean()
# tensor(-0.6595)

out.backward()
# The hook fires and prints the gradient w.r.t. mask_weights (note it is dense):
# tensor([[ 0.1920,  0.1757,  0.0046],
#         [ 0.1920,  0.1757,  0.0046]])

weights.grad
# The gradient accumulated on the leaf weights, by contrast, is masked:
# tensor([[ 0.0000,  0.1757,  0.0000],
#         [ 0.1920,  0.0000,  0.0046]])

As you can see, the gradient of mask_weights printed by the hook is not masked, while weights.grad is.
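
This makes sense: the hook sees the gradient arriving at mask_weights, before it has been propagated through the multiplication by mask. Since mask_weights = weights * mask, the chain rule gives weights.grad = grad_mask_weights * mask, which zeroes every disconnected entry. A quick self-contained sanity check, under the same sketch assumptions as above:

import torch

# Hypothetical check: weights.grad should equal the gradient at the
# masked tensor multiplied element-wise by the mask.
w = torch.rand(2, 3, requires_grad=True)
m = torch.tensor([[0., 1., 0.],
                  [1., 0., 1.]])
mw = w * m
captured = []
mw.register_hook(captured.append)        # store the gradient at mw

x = torch.rand(3, 1)
z = torch.ones(2, 1)
out = (torch.mm(mw, x) - z).mean()
out.backward()

assert torch.allclose(w.grad, captured[0] * m)  # masked, as observed above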