How to build an LSTM with alternating outputs?

I came across the paper ‘Efficient Neural Architecture Search via Parameter Sharing’ and I want to implement its network in PyTorch.

However, after reading the paper, I found that it uses an LSTM whose output type alternates with the input type. For instance, if a letter is fed to the LSTM, it outputs something representing a number; if a number (formatted as a string) is fed in, it outputs something representing a letter.
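To make the behaviour concrete, here is a minimal PyTorch sketch of what I understand by "alternating outputs": a single LSTMCell with two output heads, where the head is chosen by the type of the current input. All class, variable, and parameter names below are my own, not from the paper.

```python
import torch
import torch.nn as nn

class AlternatingLSTM(nn.Module):
    """One LSTMCell, two output heads; the head alternates with the input type."""
    def __init__(self, num_letters=26, num_digits=10, hidden_size=64):
        super().__init__()
        self.letter_embed = nn.Embedding(num_letters, hidden_size)  # embeds letter tokens
        self.digit_embed = nn.Embedding(num_digits, hidden_size)    # embeds digit tokens
        self.cell = nn.LSTMCell(hidden_size, hidden_size)
        self.to_digits = nn.Linear(hidden_size, num_digits)    # head used after a letter input
        self.to_letters = nn.Linear(hidden_size, num_letters)  # head used after a digit input

    def forward(self, tokens, is_letter):
        # tokens: list of int indices; is_letter: list of bools marking each input's type
        h = torch.zeros(1, self.cell.hidden_size)
        c = torch.zeros(1, self.cell.hidden_size)
        outputs = []
        for tok, letter in zip(tokens, is_letter):
            idx = torch.tensor([tok])
            x = self.letter_embed(idx) if letter else self.digit_embed(idx)
            h, c = self.cell(x, (h, c))
            # choose the output head according to the type of the current input
            outputs.append(self.to_digits(h) if letter else self.to_letters(h))
        return outputs

model = AlternatingLSTM()
logits = model([3, 7], [True, False])      # a letter index, then a digit index
print(logits[0].shape, logits[1].shape)    # torch.Size([1, 10]) torch.Size([1, 26])
```

Without training, the two heads of course produce arbitrary scores, which is part of what I am unsure about in question 1 below.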

My questions are:

  1. Is it possible for an LSTM to behave like this without any training?
  2. If not (which I guess is the case), is it possible to build one by combining two LSTMs whose word embeddings are constrained to letters and digits respectively? (The combined LSTM should behave as required without any training; a rough sketch of what I mean is below.)
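For question 2, this is roughly the construction I have in mind: one cell only ever sees letter embeddings, the other only ever sees digit embeddings, and they hand the hidden state back and forth so that together they behave like one LSTM with alternating outputs. Again, all names are mine, and the model is untrained, so this only illustrates the structure.

```python
import torch
import torch.nn as nn

class CombinedLSTM(nn.Module):
    """Two LSTMCells sharing one hidden state: one steps on letters, one on digits."""
    def __init__(self, num_letters=26, num_digits=10, hidden_size=64):
        super().__init__()
        self.letter_embed = nn.Embedding(num_letters, hidden_size)  # letters only
        self.digit_embed = nn.Embedding(num_digits, hidden_size)    # digits only
        self.letter_cell = nn.LSTMCell(hidden_size, hidden_size)    # steps on letter inputs
        self.digit_cell = nn.LSTMCell(hidden_size, hidden_size)     # steps on digit inputs
        self.to_digits = nn.Linear(hidden_size, num_digits)
        self.to_letters = nn.Linear(hidden_size, num_letters)

    def forward(self, tokens, is_letter):
        h = torch.zeros(1, self.letter_cell.hidden_size)
        c = torch.zeros(1, self.letter_cell.hidden_size)
        outputs = []
        for tok, letter in zip(tokens, is_letter):
            idx = torch.tensor([tok])
            if letter:
                # the letter LSTM consumes the letter and predicts a digit
                h, c = self.letter_cell(self.letter_embed(idx), (h, c))
                outputs.append(self.to_digits(h))
            else:
                # the digit LSTM consumes the digit and predicts a letter
                h, c = self.digit_cell(self.digit_embed(idx), (h, c))
                outputs.append(self.to_letters(h))
        return outputs
```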