Training that differentiates w.r.t. both the input and the neural network weights

I have a training process that follows this high-level procedure:

optimizer = optim.Adam(nn.parameters(), lr=lr) # where nn is a neural network
x = data.draw()        # draw some sample points
x.requires_grad_(True) # needed so gradients w.r.t. x can be computed
y = nn(x)              # forward pass through the network

dy = torch.autograd.grad(y.split(1), [x])[0] # gradient of y w.r.t. the sample points x

loss = func(dy) # loss function defined on dy

However, when I call

loss.backward() # this line raises an error

the error is:

RuntimeError: element 0 of tensors does not require grad and does not have a grad_fn

This is a little strange, because dy depends on y, which in turn depends on nn.parameters(), so the gradient of the loss w.r.t. the network weights should exist and be valid.
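
For reference, here is a minimal self-contained sketch that reproduces the error; the network, data, and loss below are only placeholders for my real ones:

import torch
import torch.nn
import torch.optim as optim

net = torch.nn.Sequential(torch.nn.Linear(1, 16), torch.nn.Tanh(), torch.nn.Linear(16, 1)) # placeholder network
optimizer = optim.Adam(net.parameters(), lr=1e-3)

x = torch.rand(8, 1, requires_grad=True) # placeholder sample points
y = net(x)

dy = torch.autograd.grad(y.split(1), [x])[0] # gradient of y w.r.t. x; dy has no grad_fn here

loss = (dy ** 2).mean() # placeholder loss built from dy
loss.backward()         # RuntimeError: element 0 of tensors does not require grad and does not have a grad_fn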

Is there any way to fix this?