Gradient of the network output with respect to the parameters

I need to calculate the gradient of the network's output with respect to the network's parameters (say, the first-layer weights). Any suggestions on how I can do this? I am doing 3-class classification, so the output of the network is a tensor of length 3.

Usually the gradients come from calling backward on a criterion like mean-squared error or cross-entropy loss, and an optimizer like stochastic gradient descent then uses them to update the parameters.

The MNIST example is a good starting point for a typical training loop: examples/ at 2639cf050493df9d3cbf065d45e6025733add0f4 · pytorch/examples · GitHub

Yeah, but I am working on a case where I need the gradient of the output itself (not of some loss function) with respect to the parameters. Is there any way I can do this?

What happens if you simply define a loss function that is the identity with respect to the output? (assuming that the output is a scalar)
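To make the scalar case concrete, here is a minimal sketch. The toy model, its layer sizes, and the input `x` are made up for illustration and are not from the original question. With a single scalar output, the identity "loss" is just the output, so calling `backward()` on it fills `.grad` with the gradient of the output itself.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Hypothetical toy network with one scalar output (sizes are assumptions).
model = nn.Sequential(nn.Linear(4, 8), nn.Tanh(), nn.Linear(8, 1))
x = torch.randn(1, 4)

# Reduce the (1, 1) output to a 0-dim tensor so backward() needs no argument.
out = model(x).squeeze()

# Using the output itself as the "loss": d(out)/d(param) lands in param.grad.
out.backward()

first_layer_grad = model[0].weight.grad
print(tuple(first_layer_grad.shape))  # same shape as the first-layer weight
```

This only works as written because `out` is a scalar. For a vector output, `backward()` would need a `gradient` argument and would return a vector-Jacobian product rather than the full set of gradients.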

My output is not a scalar, though. In that case, is there any function I can use?

In case you want to compute something like a Jacobian, you can take a look at Automatic differentiation package - torch.autograd — PyTorch master documentation
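One way to build such a Jacobian by hand is to differentiate each output component separately with `torch.autograd.grad`. This is only a sketch: the 3-output toy model and the input below are assumptions chosen to mirror the 3-class setup in the question.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Hypothetical 3-class network (layer sizes are assumptions).
model = nn.Sequential(nn.Linear(4, 8), nn.Tanh(), nn.Linear(8, 3))
x = torch.randn(1, 4)

out = model(x).squeeze(0)  # shape (3,), one entry per class
w = model[0].weight        # first-layer weights, shape (8, 4)

# One backward pass per output component; retain_graph keeps the graph
# alive so it can be reused for the next component.
rows = [torch.autograd.grad(out[i], w, retain_graph=True)[0] for i in range(3)]

# Stack into a Jacobian: jac_loop[i] is d(out[i]) / d(w).
jac_loop = torch.stack(rows)
print(tuple(jac_loop.shape))
```

This costs one backward pass per output, which is fine for 3 classes but gets expensive when the output is large.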

torch.autograd.functional.jacobian works.
Thanks for the reply.
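For anyone finding this later, here is roughly what that can look like. It is a sketch under assumptions, not the exact code from my project: the toy model, the input, and the helper `output_given_w` are made up for illustration. `torch.autograd.functional.jacobian` differentiates a function with respect to its tensor inputs, so the first-layer weight is passed in as the function argument and the forward pass is rewritten to use it.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)

# Hypothetical 3-class network (layer sizes are assumptions).
model = nn.Sequential(nn.Linear(4, 8), nn.Tanh(), nn.Linear(8, 3))
x = torch.randn(1, 4)


def output_given_w(w):
    """Forward pass with the first-layer weight replaced by `w`."""
    h = torch.tanh(F.linear(x, w, model[0].bias))
    return model[2](h).squeeze(0)  # shape (3,)


# Jacobian of the 3 outputs with respect to the (8, 4) first-layer weight.
jac = torch.autograd.functional.jacobian(output_given_w, model[0].weight.detach())
print(tuple(jac.shape))  # output shape followed by weight shape
```

The result has the output shape followed by the weight shape, so `jac[i]` is the gradient of output `i` with respect to the first-layer weights.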