torch.autograd.grad and masking issue

Hello, I have the following issue, which I cannot solve, although I think I understand the main idea behind it.
I have the following tensors:
point: shape (N, 3), requires_grad=True
output: shape (N,), requires_grad=True
mask: shape (N,), requires_grad=False, dtype=torch.bool

output is a function of point.
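
For concreteness, here is a minimal setup that reproduces this. The function is just a toy stand-in for my real computation (a per-point sum of squares), and N and the mask values are arbitrary:

import torch

N = 5
point = torch.randn(N, 3, requires_grad=True)
mask = torch.tensor([True, False, True, True, False])  # dtype=torch.bool

# toy stand-in for my real function; output[i] depends only on point[i]
output = (point ** 2).sum(dim=1)  # shape (N,)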

This works:
grad, = torch.autograd.grad(output, point, grad_outputs=torch.ones_like(output))  # grad_outputs needed since output is non-scalar

This does not work:
grad, = torch.autograd.grad(output[mask], point[mask], grad_outputs=torch.ones_like(output[mask]))
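
With the toy setup above, the second call raises:

RuntimeError: One of the differentiated Tensors appears to not have been used in the graph. Set allow_unused=True if this is the desired behavior.

(and with allow_unused=True it just returns None instead of a gradient).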

So it seems that masking breaks the computation graph, i.e. output[mask] and point[mask] are not connected in the computation graph. I have the following questions:

  1. Does masking create a copy of the tensor that has no connection to the original tensor? And how exactly is masking related to the computation graph? (See the small probe after this list.)
  2. How can I make this code work with masking? I have been trying but have not managed to get it working. (My best workaround so far is also shown below.)
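
For question 1, here is a small probe (using the toy setup above) which suggests that masking is itself an autograd operation that produces a new tensor downstream of point, rather than a view onto the node I actually want to differentiate with respect to:

print(point[mask].is_leaf)   # False - indexing produced a new, non-leaf tensor
print(point[mask].grad_fn)   # something like <IndexBackward0 object at 0x...>
print(output[mask].grad_fn)  # also an IndexBackward0 node, hanging off output

So if I understand correctly, output[mask] still depends on the full point, but not on the freshly created point[mask], which would explain why autograd cannot connect them.

For question 2, my best workaround so far is to differentiate with respect to the full point and mask the result afterwards, but I am not sure this is the intended approach:

grad_masked = torch.autograd.grad(output, point, grad_outputs=torch.ones_like(output))[0][mask]  # shape (mask.sum(), 3)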

Thank you!