Are there any standard or recommended SOS and EOS tokens for seq2seq encoding with RNNs / LSTMs / Transformers applied to real-valued and/or complex-valued 1D signals with few samples (i.e. fewer than 100 samples per input signal)? The encoding is meant for signal discrimination; in other words, we are generating fixed-size embeddings of 1D signals for classification.
I am having a hard time finding good references for this. I know of wav2vec (convolutions) and wav2vec 2.0 (convolutions plus a Transformer), but I am not sure such approaches are suited to short signals that never exceed, say, a hundred samples. I also looked at articles such as “LSTM-based auto-encoder model for ECG arrhythmias classification”, without finding a proper explanation for the choice of SOS and EOS values. Perhaps I missed a typical convention (for example, for a real-valued signal in [0,1], could SOS be 2 and EOS be -1?).
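To make the question concrete, here is a minimal sketch of the two marker schemes I can think of. The function names are my own and do not come from any library, and I am not claiming either scheme is a standard. The first is the out-of-range sentinel idea above, which only works for real-valued signals with a known range. The second, which is an assumption on my part, splits a (possibly complex) signal into real and imaginary channels and flags SOS/EOS in two extra channels, so the markers cannot collide with a sample value.

```python
def add_sentinel_values(signal, sos=2.0, eos=-1.0, lo=0.0, hi=1.0):
    """Sentinel variant: valid only if every real sample lies in [lo, hi]."""
    if any(not lo <= x <= hi for x in signal):
        raise ValueError("signal leaves [lo, hi]; sentinels would be ambiguous")
    return [sos] + list(signal) + [eos]


def to_channels(signal):
    """Split each (real or complex) sample into a (real, imaginary) pair."""
    return [(complex(x).real, complex(x).imag) for x in signal]


def add_marker_channels(signal):
    """Flag-channel variant: each step is [re, im, is_sos, is_eos].

    Data steps carry both flags at 0, so no sample value is reserved.
    """
    steps = [[0.0, 0.0, 1.0, 0.0]]  # SOS step: zero payload, SOS flag set
    steps += [[re, im, 0.0, 0.0] for re, im in to_channels(signal)]
    steps.append([0.0, 0.0, 0.0, 1.0])  # EOS step: zero payload, EOS flag set
    return steps


print(add_sentinel_values([0.0, 0.5, 1.0]))
print(add_marker_channels([0.5 + 0.25j, 1.0]))
```

The sentinel variant keeps the input one-dimensional but depends on the signal being normalised beforehand; the flag-channel variant widens each step from 2 to 4 features and works for any value range.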
I would also be interested in a more elaborate reference in which the encoding network takes into account differing sampling frequencies across input signals, and which clearly states which SOS and EOS are used and why.
Acronyms suggested in this PyTorch tutorial:
SOS: start of sequence token
EOS: end of sequence token
EDIT (possible answer): It may be that SOS and EOS are not necessary when a seq2seq model reconstructs fixed-size signals. EOS (and SOS?) is needed when seq2seq is applied to translation, where part of the task is learning to predict the length of the translated sentence, i.e. choosing when to stop. In contrast, fixed-size signal reconstruction does not require learning the output sequence length. I would still be glad to receive pointers to references that develop this intuition.
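As an illustration of that intuition, here is a rough PyTorch sketch of a fixed-length LSTM autoencoder that uses neither SOS nor EOS. It is not taken from any of the references above. The class name, the layer sizes, and the choice to feed the embedding to the decoder at every step (instead of feeding back previous outputs) are all assumptions of mine. Complex samples are handled by viewing them as two real channels.

```python
import torch
from torch import nn


class FixedLengthAutoencoder(nn.Module):
    """Hypothetical sketch: the decoder always runs for exactly seq_len steps."""

    def __init__(self, n_channels=2, embed_dim=16, seq_len=64):
        super().__init__()
        self.seq_len = seq_len
        self.encoder = nn.LSTM(n_channels, embed_dim, batch_first=True)
        self.decoder = nn.LSTM(embed_dim, embed_dim, batch_first=True)
        self.readout = nn.Linear(embed_dim, n_channels)

    def encode(self, x):
        # x: (batch, seq_len, n_channels); the final hidden state is the embedding.
        _, (h_n, _) = self.encoder(x)
        return h_n[-1]  # (batch, embed_dim)

    def forward(self, x):
        z = self.encode(x)
        # The output length is fixed, so no stop decision (EOS) is learned,
        # and the decoder input is the embedding itself, so no SOS is needed.
        decoder_in = z.unsqueeze(1).repeat(1, self.seq_len, 1)
        out, _ = self.decoder(decoder_in)
        return self.readout(out), z


# Random complex signals: 8 signals of 64 samples each (illustrative data only).
signal = torch.randn(8, 64, dtype=torch.cfloat)
x = torch.view_as_real(signal)  # (8, 64, 2): real and imaginary channels
model = FixedLengthAutoencoder()
recon, z = model(x)
loss = nn.functional.mse_loss(recon, x)
print(recon.shape, z.shape, loss.item())
```

Here `z` would be the fixed-size embedding passed to a classifier. Signals of varying length would need padding and masking, or a stop mechanism after all, which is where an EOS-like marker might come back in.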
(I am posting in uncategorized since this seems to be neither an “audio” nor a “data” category question.)