I am trying to encode the alphabet into a numerical format, and it appears there are two ways to go about this: using class labels (0, 1, 2, …, 25) or using a one-hot vector, with each column representing one of the 26 letters. From what I've read online, it would make sense to use a one-hot vector, since there is no ordinal relationship between letters, e.g. A is not 'close' to B. However, I am struggling to use one-hot vectors with PyTorch because the loss functions only take class labels. It is easy to convert a one-hot vector to a class label so it can be passed to a loss function, but would this hurt the performance of the network? This is the architecture problem I am facing.
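For reference, here is a minimal sketch of the conversion I mean (the tensors and letter choices are just illustrative). Since a one-hot vector and its `argmax` carry exactly the same information, the conversion itself is lossless:

```python
import torch
import torch.nn as nn

# Hypothetical batch of 4 letters, one-hot encoded over 26 classes
one_hot = torch.zeros(4, 26)
one_hot[torch.arange(4), torch.tensor([0, 4, 25, 7])] = 1.0  # A, E, Z, H

# Converting one-hot targets to class indices loses no information
labels = one_hot.argmax(dim=1)  # tensor([0, 4, 25, 7])

# nn.CrossEntropyLoss expects raw logits (e.g. a 26-dim linear output)
# and integer class indices as targets
logits = torch.randn(4, 26)  # stand-in for the network's output
loss = nn.CrossEntropyLoss()(logits, labels)
```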
It may be worth mentioning my situation: this is a many-to-one sequence network using RNN layers, and the network eventually passes through a linear layer with an output shape of 26.
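Roughly, the architecture looks like this (a sketch with assumed sizes; the hidden size and input encoding are placeholders, not from my actual code):

```python
import torch
import torch.nn as nn

class ManyToOneRNN(nn.Module):
    """Many-to-one sequence model: RNN layers followed by a
    linear layer producing 26 logits (one per letter)."""
    def __init__(self, input_size=26, hidden_size=64, num_classes=26):
        super().__init__()
        self.rnn = nn.RNN(input_size, hidden_size, batch_first=True)
        self.fc = nn.Linear(hidden_size, num_classes)

    def forward(self, x):               # x: (batch, seq_len, input_size)
        _, h_n = self.rnn(x)            # final hidden state: (1, batch, hidden)
        return self.fc(h_n.squeeze(0))  # logits: (batch, num_classes)

model = ManyToOneRNN()
x = torch.randn(8, 5, 26)   # batch of 8 sequences, length 5
logits = model(x)           # shape: (8, 26)
```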