Hi everyone, I followed the Harvard Annotated Transformer and everything runs fine with text and integer tokens. However, since I have float data for time-series forecasting, I'd like to bypass the Embedding layer, which only accepts `.long()` (integer) inputs. I searched for the `nn.Embedding` source code in PyTorch, but I guess the lookup is implemented in C. How can I customize `nn.Embedding` so that it either passes the original input through unchanged or accepts floats directly, without any integer requirement/conversion?
Here is the class being used:

```python
class Embeddings(nn.Module):
    def __init__(self, d_model, vocab):
        super(Embeddings, self).__init__()
        self.lut = nn.Embedding(vocab, d_model)  # here
        self.d_model = d_model

    def forward(self, x):
        return self.lut(x) * math.sqrt(self.d_model)
```
Further up in the code, we have:
```python
def make_model(src_vocab, tgt_vocab, N=6, d_model=512, d_ff=2048, h=8, dropout=0.1):
    "Helper: Construct a model from hyperparameters."
    c = copy.deepcopy
    attn = MultiHeadedAttention(h, d_model)
    ff = PositionwiseFeedForward(d_model, d_ff, dropout)
    position = PositionalEncoding(d_model, dropout)
    model = EncoderDecoder(
        Encoder(EncoderLayer(d_model, c(attn), c(ff), dropout), N),
        Decoder(DecoderLayer(d_model, c(attn), c(attn), c(ff), dropout), N),
        nn.Sequential(Embeddings(d_model, src_vocab), c(position)),
        nn.Sequential(Embeddings(d_model, tgt_vocab), c(position)),
        Generator(d_model, tgt_vocab))
    for p in model.parameters():
        if p.dim() > 1:
            nn.init.xavier_uniform_(p)  # note: the non-underscore xavier_uniform is deprecated
    return model
```

I tried to use `torch.cat` without success. Thanks in advance!
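For context, here is a sketch of the kind of replacement I'm after: swapping the integer lookup for a learned linear projection, so float features are mapped to `d_model` dimensions instead of being looked up. The class name `ContinuousEmbedding` and the `input_dim` parameter are my own, not from the tutorial:

```python
import math
import torch
import torch.nn as nn

class ContinuousEmbedding(nn.Module):
    """Hypothetical drop-in replacement for Embeddings: projects float
    features of size input_dim to d_model instead of an integer lookup."""
    def __init__(self, d_model, input_dim):
        super().__init__()
        self.proj = nn.Linear(input_dim, d_model)  # accepts float tensors
        self.d_model = d_model

    def forward(self, x):
        # x: (batch, seq_len, input_dim) float tensor
        # keep the sqrt(d_model) scaling from the original Embeddings class
        return self.proj(x) * math.sqrt(self.d_model)

# usage: a univariate time series, batch of 4, 20 time steps
emb = ContinuousEmbedding(d_model=512, input_dim=1)
x = torch.randn(4, 20, 1)  # float input, no .long() needed
out = emb(x)
print(out.shape)  # torch.Size([4, 20, 512])
```

Would something like this, wired into `make_model` in place of `Embeddings(d_model, src_vocab)`, be the right approach?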