How to combine additional features into word embeddings

This is similar to what I have in mind.

I’m re-implementing some table-to-text papers that use RNN-based seq2seq models (like this one).

But I’m having trouble understanding the representation of the input. The idea is that each word token embedding (a slot value) is concatenated with additional information about its field name. But a field name can have several tokens (e.g. "day of birth"), so the field input passed to the Embedding layer would be 3D, while the word token input is 2D.

For example:

"George Mikell .... 4 April..." Let's say these are 10 slot-value tokens,

and there is a sequence of corresponding field names (actually, I’m not sure this is what the authors mean):

name, name, ..., day of birth, day of birth, ... This is the confusing part, since a field name can have several tokens. If I encode it, the shape would be (10, l), where l is the maximum number of tokens in a field name, let's say 3. → (10, 3)

Now, in the forward pass, let's say batch size = 1.

So slot values: (1, 10)
field names: (1, 10, 3)

Then, after the embedding layers: slot embeddings → (1, 10, 64)
field embeddings → (1, 10, 3, 32)
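
To make the shapes concrete, here is a minimal NumPy sketch that treats an embedding layer as a plain table lookup. The vocabulary sizes, the random ids, and all variable names are made up for illustration; only the sequence length (10), the maximum field-name length (3), and the embedding sizes (64 and 32) come from the example above.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes, chosen only to match the shapes in this post.
word_vocab, field_vocab = 1000, 50
seq_len, max_field_len = 10, 3

# An embedding layer is a table lookup: one row per vocabulary id.
word_table = rng.normal(size=(word_vocab, 64))
field_table = rng.normal(size=(field_vocab, 32))

# Batch of 1: one id per slot-value token, up to 3 ids per field name.
slot_ids = rng.integers(0, word_vocab, size=(1, seq_len))
field_ids = rng.integers(0, field_vocab, size=(1, seq_len, max_field_len))

slot_emb = word_table[slot_ids]      # shape (1, 10, 64)
field_emb = field_table[field_ids]   # shape (1, 10, 3, 32)

print(slot_emb.shape, field_emb.shape)
```

The field embeddings carry one extra axis (the tokens of each field name), which is why the two tensors cannot be concatenated directly along the last dimension.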

How can I combine these features?

Thank you