How to calculate MSE loss with respect to input gradient?

What is the best way to optimize \theta for the following loss function:
[Image: screenshot of the loss function]

I tried the following, but it does not work:

optimizer.zero_grad()
Input.requires_grad_()
Output = Model(Input)
Output_max = Output[0,target1]
Output_max.backward(retain_graph = True)
loss = criterion(Input.grad, target)
loss.backward()
optimizer.step()

I get the following error from loss.backward():

element 0 of tensors does not require grad and does not have a grad_fn
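
For context on the error: a plain backward() call stores Input.grad as an ordinary tensor with no grad_fn, so a loss built from it has nothing to backpropagate through. One possible way around this, as far as I can tell, is to compute the input gradient with torch.autograd.grad and create_graph=True, which keeps the gradient computation itself in the graph. Below is a minimal sketch under assumptions: the small model, the shapes, the zero target, and the index target1 = 2 are hypothetical stand-ins, not the setup from the question.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Hypothetical stand-ins for Model, optimizer and criterion from the question.
model = nn.Sequential(nn.Linear(4, 8), nn.Tanh(), nn.Linear(8, 3))
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
criterion = nn.MSELoss()

inp = torch.randn(1, 4, requires_grad=True)
target = torch.zeros(1, 4)  # assumed desired input gradient, same shape as inp
target1 = 2                 # assumed index of the output unit to differentiate

optimizer.zero_grad()
output = model(inp)
output_max = output[0, target1]

# create_graph=True records the gradient computation, so input_grad has a
# grad_fn and the loss can be differentiated with respect to the parameters.
(input_grad,) = torch.autograd.grad(output_max, inp, create_graph=True)

loss = criterion(input_grad, target)
loss.backward()
optimizer.step()
```

Note that this needs second derivatives of the model, so it costs more memory and time than a normal training step. Parameters that do not influence the input gradient (here the bias of the last layer) simply receive no gradient.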