From the documentation, I learned that nn.Embedding takes a “LongTensor of arbitrary shape containing the indices to extract” as input.
But let’s say I have a data field named movie_genre for each sample movie, and its values are selected from the following genres:
Action
Adventure
Animation
Comedy
...
Each movie might have multiple genres:
mid   genres
1     Action | Adventure
2     Animation
3     Comedy | Adventure | Action
If I use a one-hot vector to encode the genre, Action can be encoded as (1, 0, 0, 0), Adventure can be encoded as (0, 1, 0, 0), and so on.
So the movie with mid 1 can be encoded as (1, 1, 0, 0), the genre of mid 2 can be encoded as (0, 0, 1, 0), and so on.
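To make the encoding concrete, here is a minimal sketch in plain Python of what I mean. It assumes only the four genres listed above (the rest of the list is left out), in that order, and the names GENRES, MOVIES, genre_indices and multi_hot are made up for illustration. It prints both the list of genre indices per movie (the kind of values an index tensor would hold) and the corresponding 0/1 vector.

```python
# Genre vocabulary: only the four genres listed in the question, in that order.
# (Assumption: the full genre list is longer but is elided in the post.)
GENRES = ["Action", "Adventure", "Animation", "Comedy"]
GENRE_TO_INDEX = {name: i for i, name in enumerate(GENRES)}

# The sample table from the question: mid -> genre field.
MOVIES = {
    1: "Action | Adventure",
    2: "Animation",
    3: "Comedy | Adventure | Action",
}


def genre_indices(genre_field):
    """Split an 'A | B' genre string into a sorted list of integer genre indices."""
    names = [part.strip() for part in genre_field.split("|")]
    return sorted(GENRE_TO_INDEX[name] for name in names)


def multi_hot(genre_field):
    """Encode a genre string as a 0/1 vector with one slot per genre."""
    vector = [0] * len(GENRES)
    for index in genre_indices(genre_field):
        vector[index] = 1
    return vector


for mid, field in MOVIES.items():
    print(mid, genre_indices(field), multi_hot(field))
```

Note that the number of indices differs from movie to movie (two, one and three here), while the 0/1 vector always has the same length.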
What should I do to make nn.Embedding support this kind of input?