Which input to use when generating a new sequence

I want to use sequence-to-sequence architecture to generate sequences.

My data has the following structure:

[0, 0, 1, 0, ..., 0, 1] --> [12.34, 0.78, 1.54, 6.90, ..., 5.32]

I am following this tutorial to achieve it.

After forwarding through the Encoder network, encoder_hidden is used as decoder_hidden. But what should I use as the first decoder_input to the Decoder network?

The original tutorial uses a Start Of Sequence (SOS) token, but I can’t use it because it is encoded as 0, and 0 as a number would probably give the decoder some additional information.

I think the SOS token can be any positive number; the network will learn what that number means.

Apart from that I have no idea what you’re trying to learn here…

You can use any number x; it only has to adhere to the following constraints:

  • 0 <= x <= (M-1), where M is the size of your vocabulary plus the number of special tokens. For example, if you have a vocabulary of size 10,000 and 4 special tokens (very common: <SOS>, <EOS>, <PAD>, <UKN> for start-of-sequence, end-of-sequence, padding, and unknown tokens), then x can be 0, 1, 2, …, 10,003.

  • x cannot already be taken by a word from your vocabulary. My vocab2idx mapping usually looks like {'<PAD>': 0, '<UKN>': 1, '<SOS>': 2, '<EOS>': 3, 'the': 4, 'a': 5, 'is': 6, ...}, so my start token would be 2.
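To make the two constraints concrete, here is a minimal sketch (the vocab2idx mapping and batch size are hypothetical, chosen to mirror the example above) that checks a candidate SOS index and builds the first decoder input for a batch:

```python
# Hypothetical vocab2idx mapping: 4 special tokens placed before the words.
vocab2idx = {'<PAD>': 0, '<UKN>': 1, '<SOS>': 2, '<EOS>': 3,
             'the': 4, 'a': 5, 'is': 6}

SOS_IDX = vocab2idx['<SOS>']

# Constraint 1: the index lies in [0, M-1], with M = vocab + special tokens.
M = len(vocab2idx)
assert 0 <= SOS_IDX <= M - 1

# Constraint 2: the index is not already taken by a real word.
word_indices = {i for tok, i in vocab2idx.items() if not tok.startswith('<')}
assert SOS_IDX not in word_indices

# First decoder input for a batch of 4 sequences: one <SOS> index each,
# shape (batch_size, 1) before being fed to an embedding layer.
batch_size = 4
decoder_input = [[SOS_IDX] for _ in range(batch_size)]
print(decoder_input)  # [[2], [2], [2], [2]]
```

In a PyTorch model this list would typically be wrapped in a long tensor and passed through the decoder's embedding layer, so the network learns a vector for the SOS index just like for any word.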
