I’m currently working on a face recognition project that requires writing multiple loss functions.

For the first loss, I use optim.SGD(model.parameters(),…) with cross-entropy loss, so the gradients for all parameters of the network are computed by autograd.

For the second loss, I would like to use my own function, which should first send the gradient to the last fully connected layer (model.classifier) and then backpropagate the gradients to all the other layers. Should I also use optim.SGD(model.parameters(),…) for this loss function, or can I use optim.SGD(model.classifier.parameters(),…) instead?
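
To make the question concrete, here is a minimal sketch of the two setups I have in mind. ToyNet, its features block, my_loss, and all the sizes are placeholders made up for illustration, not my real model or loss; only the model.classifier name matches my network. As far as I understand, backward() computes gradients for every layer in both cases and the optimizer only updates the parameters it was given, so the second setup would leave the earlier layers unchanged.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F
import torch.optim as optim

torch.manual_seed(0)

# Hypothetical stand-in for the real network: a small feature extractor
# followed by the last fully connected layer, model.classifier.
class ToyNet(nn.Module):
    def __init__(self, in_dim=8, hidden=16, num_classes=4):
        super().__init__()
        self.features = nn.Sequential(nn.Linear(in_dim, hidden), nn.ReLU())
        self.classifier = nn.Linear(hidden, num_classes)

    def forward(self, x):
        return self.classifier(self.features(x))

def my_loss(logits, target):
    # Placeholder for the custom second loss (assumption: any differentiable
    # function of the classifier output behaves the same way here).
    one_hot = F.one_hot(target, logits.shape[1]).float()
    return ((logits - one_hot) ** 2).mean()

def snapshot(module):
    # Copy the current parameter values so they can be compared after a step.
    return [p.detach().clone() for p in module.parameters()]

def changed(before, module):
    # True if any parameter of the module differs from its snapshot.
    return any(not torch.equal(b, p) for b, p in zip(before, module.parameters()))

model = ToyNet()
x = torch.randn(5, 8)
y = torch.randint(0, 4, (5,))

# Setup B: the optimizer only sees the classifier parameters.
opt_b = optim.SGD(model.classifier.parameters(), lr=0.1)
feat_before, clf_before = snapshot(model.features), snapshot(model.classifier)
model.zero_grad()
my_loss(model(x), y).backward()
# backward() still fills .grad for the earlier layers...
grads_reach_features = all(p.grad is not None for p in model.features.parameters())
opt_b.step()
# ...but only the classifier is updated by this optimizer.
features_changed_b = changed(feat_before, model.features)
classifier_changed_b = changed(clf_before, model.classifier)

# Setup A: the optimizer sees every parameter, so all layers are updated.
opt_a = optim.SGD(model.parameters(), lr=0.1)
feat_before = snapshot(model.features)
model.zero_grad()
my_loss(model(x), y).backward()
opt_a.step()
features_changed_a = changed(feat_before, model.features)

print(grads_reach_features, features_changed_b, classifier_changed_b, features_changed_a)
```

If that reading is right, optim.SGD(model.parameters(),…) is the setup to use when the earlier layers should also learn from the second loss, and optim.SGD(model.classifier.parameters(),…) only makes sense when they should stay fixed.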

Thanks in advance!!