How to minimize the norm of the gradient w.r.t. the inputs?

Please help. I have a simple net that takes several inputs and outputs a single value. I can compute the gradient of the output w.r.t. all the inputs via autograd. Now I want to minimize the norm of this gradient w.r.t. the model parameters… Is that possible? I remember it was possible in TensorFlow 1.x, but I'm not so sure about PyTorch. The sample code would be something like:

    x.requires_grad_()
    y = Model(x)
    # create_graph=True keeps the graph of this backward pass,
    # so x.grad is itself differentiable w.r.t. the parameters
    y.backward(create_graph=True)
    loss = x.grad.square().sum()
    loss.backward()  # gradients of the gradient norm end up in the parameters' .grad
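
In case it helps, here is a fuller, self-contained sketch of what I am trying to do, using torch.autograd.grad with create_graph=True so the input gradient stays on the graph; the model, optimizer, and tensor shapes below are just toy placeholders, not my actual setup:

    import torch
    import torch.nn as nn

    # toy stand-in for my network: several inputs -> one output
    model = nn.Sequential(nn.Linear(4, 16), nn.Tanh(), nn.Linear(16, 1))
    optimizer = torch.optim.SGD(model.parameters(), lr=1e-2)

    x = torch.randn(8, 4, requires_grad=True)  # batch of inputs
    y = model(x).sum()                          # reduce to a scalar output

    # gradient of the output w.r.t. the inputs, kept differentiable
    grad_x, = torch.autograd.grad(y, x, create_graph=True)

    # squared L2 norm of that gradient is the training loss
    loss = grad_x.pow(2).sum()

    optimizer.zero_grad()
    loss.backward()   # backprops through the gradient computation into the parameters
    optimizer.step()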

Thanks!