Quick question: if the output of my network is a high-dimensional vector and the target is of the same dimension, can I create a loss function that takes the mean squared error of each output node separately, so that each output node gets its own error signal? Is this possible?
In the end, you mathematically only get a gradient with the shape of your parameters if you differentiate a scalar function.
There are cases where you want the Jacobian, i.e. to differentiate a vector-valued output, but those are rare enough that PyTorch does not support them as conveniently as differentiating scalars. (But you'll find tricks if you search for "Jacobian" here on the forum.)
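As a minimal sketch of what this looks like in practice: `nn.MSELoss(reduction='none')` keeps one squared error per output element, but `backward()` still requires a scalar, so you reduce (e.g. mean or a weighted sum) before differentiating. The model and shapes below are made-up placeholders, not from the original post.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Hypothetical setup: output and target are both 5-dimensional.
model = nn.Linear(3, 5)
x = torch.randn(4, 3)        # batch of 4 inputs
target = torch.randn(4, 5)   # matching 5-dim targets

# reduction='none' yields one squared error per output element...
loss_fn = nn.MSELoss(reduction='none')
per_element = loss_fn(model(x), target)   # shape (4, 5)

# ...but autograd differentiates a scalar, so reduce before backward().
# A weighted sum lets each output node contribute its own error signal.
weights = torch.ones(5)                   # per-node weights (all equal here)
loss = (per_element * weights).mean()
loss.backward()

print(model.weight.grad.shape)            # gradient has the parameter's shape
```

Calling `per_element.backward(torch.ones_like(per_element))` would be equivalent to differentiating the sum; either way, each parameter receives a single gradient that aggregates the per-node errors.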