Learning best 'label smoothing' values

Consider a simple use of label smoothing on MNIST. You might use the label [0.01, 0.91, 0.01, …, 0.01] for the class ‘1’, and similarly for the other 9 classes. I have an idea I would like to try out: have the network learn the best label for each sample. I think this would require backpropagating through the labels and updating them along with the weights during training. How can I set things up so that this occurs?
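
To make the idea concrete, here is a minimal sketch of the mechanics as I understand them, in plain Python with a hand-derived gradient and no autograd. It assumes each sample's label is stored as a vector of logits and passed through a softmax, so the label stays a valid probability distribution while it is being updated. The prediction values, the learning rate, and all function names are made up for illustration. In a real framework the label logits would presumably be a trainable tensor indexed by sample, and autograd would supply this gradient.

```python
import math

def softmax(z):
    # Numerically stable softmax over a list of floats.
    m = max(z)
    e = [math.exp(v - m) for v in z]
    s = sum(e)
    return [v / s for v in e]

def soft_cross_entropy(label_logits, pred_logits):
    # Cross-entropy between the soft label q = softmax(label_logits)
    # and the prediction p = softmax(pred_logits).
    q = softmax(label_logits)
    p = softmax(pred_logits)
    return -sum(qk * math.log(pk) for qk, pk in zip(q, p))

def grad_wrt_label_logits(label_logits, pred_logits):
    # With l_k = -log p_k and loss L = sum_k q_k * l_k,
    # the chain rule through the softmax gives dL/dz_j = q_j * (l_j - L).
    q = softmax(label_logits)
    p = softmax(pred_logits)
    nll = [-math.log(pk) for pk in p]
    loss = sum(qk * lk for qk, lk in zip(q, nll))
    return [qj * (lj - loss) for qj, lj in zip(q, nll)]

# Smoothed label for class '1'; taking the log gives logits whose
# softmax reproduces the label exactly, because the entries sum to 1.
smoothed = [0.01, 0.91] + [0.01] * 8
label_logits = [math.log(v) for v in smoothed]

# Hypothetical network output for this sample (made-up numbers).
pred_logits = [0.1, 2.0, 0.3, 0.0, 0.0, 0.0, 0.0, 1.5, 0.0, 0.0]

lr = 0.1  # assumed learning rate for the label update
loss_before = soft_cross_entropy(label_logits, pred_logits)
grad = grad_wrt_label_logits(label_logits, pred_logits)

# One gradient-descent step on the label itself, network held fixed.
label_logits = [z - lr * g for z, g in zip(label_logits, grad)]
loss_after = soft_cross_entropy(label_logits, pred_logits)

print(loss_before, loss_after)
print(softmax(label_logits))
```

One thing I am unsure about: with the loss above and nothing else, each label can lower the loss simply by moving its mass toward whichever class the network already predicts most strongly, so some extra term that keeps the label close to the original one may be needed.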