How to optimize a term that contains the gradient w.r.t. the input?

Hi there,

I've run into the following problem: how do I compute

image

in PyTorch? Here, X is the model input, Y is the prediction, and \theta denotes the model parameters.
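
To make the question concrete, here is a minimal sketch of the kind of computation I mean. It assumes the term is a penalty on the gradient of Y with respect to X (the mean squared norm is used purely as an example), and that this penalty should then be differentiated with respect to \theta. The toy model, the tensor shapes, and the names `grad_x` and `penalty` are hypothetical stand-ins, not my real setup.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Hypothetical toy model; theta corresponds to model.parameters()
model = nn.Sequential(nn.Linear(4, 8), nn.Tanh(), nn.Linear(8, 1))

# The input must track gradients so that dY/dX can be requested
X = torch.randn(16, 4, requires_grad=True)
Y = model(X)

# dY/dX for every sample; create_graph=True keeps the graph of this
# gradient so it can itself be differentiated w.r.t. theta
(grad_x,) = torch.autograd.grad(Y.sum(), X, create_graph=True)

# Assumed example term: mean squared norm of the input gradient
penalty = grad_x.pow(2).sum(dim=1).mean()

# Backpropagate through the gradient computation into the parameters
model.zero_grad()
penalty.backward()
```

After `penalty.backward()`, the `.grad` of each parameter that influences dY/dX holds the derivative of the penalty with respect to it, so an optimizer step would follow as usual. Parameters that do not affect dY/dX, such as the bias of the last layer in this toy model, receive no gradient from this term.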

Thanks in advance~