Some small questions about building an LSTM with char embeddings

Hey guys!

After working through the pytorch tutorial “Classifying Names with a Character-Level RNN”, I’d like to implement a character-level bidirectional LSTM. I kinda understand how LSTMs work, but I’ve never coded one myself. So I have a few trivial questions (.-.)

My dataset consists of medieval dialect corpora. I have four labels (dialects).

The documentation of torch.nn.LSTM() says I need these parameters:

input_size – The number of expected features in the input x
hidden_size – The number of features in the hidden state h
num_layers – Number of recurrent layers.

  1. Question: Can I just give these parameters any values I like? Or are they dependent on the input tensor?
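To make this concrete, here’s roughly what I have so far (the sizes are just placeholders I picked myself, not values derived from my data):

```python
import torch.nn as nn

# Placeholder sizes I chose myself -- unsure whether they have to be
# derived from the input tensor or can be picked freely
input_size = 16    # number of features per input character (e.g. embedding dim)
hidden_size = 32   # number of features in the hidden state h
num_layers = 1     # number of stacked recurrent layers

lstm = nn.LSTM(input_size, hidden_size, num_layers, bidirectional=True)
```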

The input tensor should have shape (seq_len, batch, input_size).

  2. Question: When I want to classify a single word (e.g. “abc”) from a specific dialect corpus at char level, should I feed the LSTM the whole word as a tensor with seq_len = 3? Or should I split the word and feed it one character at a time?
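For the whole-word variant, this is what I imagine it would look like (random vectors stand in for real character representations here, and the sizes are the placeholders from above):

```python
import torch
import torch.nn as nn

lstm = nn.LSTM(input_size=16, hidden_size=32, num_layers=1, bidirectional=True)

# the whole word "abc" as one sequence: (seq_len=3, batch=1, input_size=16);
# random vectors stand in for the actual character representations
word = torch.randn(3, 1, 16)

output, (h_n, c_n) = lstm(word)
# output has shape (seq_len, batch, 2 * hidden_size), since it's bidirectional
```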

  3. Question: How do I generate character embeddings?
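From what I’ve read, nn.Embedding might be the way to do this; here’s my rough sketch (the three-letter alphabet is just for illustration, in my case it would be all characters occurring in the corpora):

```python
import torch
import torch.nn as nn

# hypothetical mini-alphabet; really this would be built from my corpora
alphabet = "abc"
char_to_idx = {ch: i for i, ch in enumerate(alphabet)}

# one trainable vector of size 16 per character
embedding = nn.Embedding(num_embeddings=len(alphabet), embedding_dim=16)

indices = torch.tensor([char_to_idx[ch] for ch in "abc"])  # shape (3,)
embedded = embedding(indices)       # shape (3, 16)
embedded = embedded.unsqueeze(1)    # shape (3, 1, 16) = (seq_len, batch, input_size)
```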

  4. Question: What should the label tensor look like? I know that to calculate the loss, I need to give the “criterion” the output of the LSTM as well as the label tensor. I tried to do that, but the label tensor only has a size of 1, which doesn’t seem right to me (I hope this question isn’t too out of context).
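This is a minimal version of what I mean (I’m assuming nn.CrossEntropyLoss as the criterion here, and the logits are made up in place of a real final layer):

```python
import torch
import torch.nn as nn

num_classes = 4                       # my four dialect labels
criterion = nn.CrossEntropyLoss()

# hypothetical output of a final linear layer on top of the LSTM
logits = torch.randn(1, num_classes)  # shape (batch, num_classes)
label = torch.tensor([2])             # shape (batch,) -- one class index, not one-hot

loss = criterion(logits, label)       # scalar loss
```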

Thank you so much for helping! This is part of my bachelor thesis and I really need to understand how this could work.