I am doing a research project in my undergraduate studies, and my professor asked me to implement an optimization algorithm he wrote.
The algorithm optimizes a fully connected feed-forward neural network for classification. To perform a weight update, I need the gradient of the loss with respect to each possible label separately, meaning that if I have n classes, I need to compute the loss (and its gradient) n times per update.
This is unusual and, to the best of my limited knowledge, impossible within the standard way optimizers are written in PyTorch, since an optimizer's `step()` only sees the single `.grad` left by one backward pass.
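To make the requirement concrete, here is a rough sketch of the per-label gradient computation I need, done outside of any optimizer. The model, data, and loss here are just placeholders; the real algorithm would consume these n gradient sets in its update rule:

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

n_classes = 3
# Placeholder network and batch, only to illustrate the gradient bookkeeping.
model = nn.Sequential(nn.Linear(4, 8), nn.ReLU(), nn.Linear(8, n_classes))
x = torch.randn(5, 4)
criterion = nn.CrossEntropyLoss()
params = list(model.parameters())

# One forward/backward per class: pretend the whole batch has label c,
# and collect the gradient of that loss w.r.t. every parameter.
per_class_grads = []
for c in range(n_classes):
    labels = torch.full((x.size(0),), c, dtype=torch.long)
    loss = criterion(model(x), labels)
    # torch.autograd.grad returns the gradients directly instead of
    # accumulating into .grad, so the n gradient sets stay separate.
    grads = torch.autograd.grad(loss, params)
    per_class_grads.append(grads)

print(len(per_class_grads))  # one gradient set per class
```

The key point is that `torch.autograd.grad` lets me keep n independent gradient sets, whereas the usual `loss.backward()` + `optimizer.step()` pattern only exposes one accumulated `.grad` per parameter.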
Can anyone help? Has anyone written a custom PyTorch optimizer that needs more than a single gradient per step?
If I wasn't clear in my description of the problem, please feel free to reply and ask - it can be quite confusing to fully understand what's going on here.
Thank you so much!