Is it possible to train only a few neurons in the last fully connected layers of any model in PyTorch? For example, if the last two layers have sizes 256 and 128 and the output has 10 nodes, is it possible to train only the last 128 neurons of the 256-size layer, the last 64 neurons of the 128-size layer, and all 10 neurons of the output?
You could manually zero out the gradients of the weights which you do not want to train, e.g. with a hook registered on those parameters.
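A minimal sketch of this idea, using the layer sizes from the question (the input size of 512 is an assumption): each output neuron of a `nn.Linear` layer corresponds to one row of its weight matrix and one entry of its bias, so freezing the first neurons means zeroing the gradients of those rows via `Tensor.register_hook`.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Assumed architecture: ...-512 -> 256 -> 128 -> 10, matching the sizes
# in the question.
model = nn.Sequential(
    nn.Linear(512, 256),
    nn.ReLU(),
    nn.Linear(256, 128),
    nn.ReLU(),
    nn.Linear(128, 10),
)

def make_freeze_hook(n_frozen):
    # Return a hook that zeros the gradient for the first n_frozen
    # output neurons (rows of the weight / entries of the bias).
    def hook(grad):
        grad = grad.clone()
        grad[:n_frozen] = 0.0
        return grad  # returned tensor replaces the original gradient
    return hook

# Freeze the first 128 of 256 neurons and the first 64 of 128 neurons;
# the output layer is left fully trainable.
for layer, n_frozen in [(model[0], 128), (model[2], 64)]:
    layer.weight.register_hook(make_freeze_hook(n_frozen))
    layer.bias.register_hook(make_freeze_hook(n_frozen))

out = model(torch.randn(4, 512))
out.sum().backward()

# Frozen rows now have exactly zero gradient; the rest are updated
# normally by the optimizer.
print(model[0].weight.grad[:128].abs().sum().item())  # 0.0
```

An alternative with the same effect is to zero the relevant slices of `param.grad` manually after `loss.backward()` and before `optimizer.step()`; the hook version just does it automatically on every backward pass.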
Note, however, that weight decay (or other regularization terms, as well as optimizer state such as momentum) could still update parameters with a zero gradient.
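One blunt but reliable way around this caveat is to snapshot the frozen slices once and copy them back after every `optimizer.step()`, so that any drift from weight decay or momentum is undone. A small sketch (the 64-neuron split and layer sizes here are illustrative):

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

layer = nn.Linear(256, 128)

# Snapshot the slices that must stay fixed (first 64 neurons, as an
# example split).
frozen_w = layer.weight[:64].detach().clone()
frozen_b = layer.bias[:64].detach().clone()

# weight_decay shrinks *all* parameters on every step, even those whose
# gradient we zeroed out.
opt = torch.optim.SGD(layer.parameters(), lr=0.1, weight_decay=1e-2)

layer(torch.randn(8, 256)).sum().backward()
opt.step()

# Restore the frozen slices so weight decay cannot move them.
with torch.no_grad():
    layer.weight[:64] = frozen_w
    layer.bias[:64] = frozen_b
```

Alternatively, simply set `weight_decay=0` for these parameters; but since weight decay applies per parameter tensor, not per neuron, that disables it for the trainable rows of the same tensor as well.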