Char LSTM for sequence tagging

I would like to do sequence tagging where each character in a sentence gets a binary tag. My LSTM class is shown below. It does not use embeddings; it takes the character indices directly.

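import torch
import torch.nn as nn
from torch import autograd

use_cuda = torch.cuda.is_available()  # assumed; the post does not show how use_cuda is set

class RNN2(nn.Module):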
    def __init__(self, hidden_dim, tagset_size, batch_size):
        super(RNN2, self).__init__()
        self.hidden_dim = hidden_dim                
        self.lstm = nn.LSTM(1, hidden_dim, 1)  # input_size, hidden_size, num_layers
        self.hidden2tag = nn.Linear(hidden_dim, tagset_size)
        self.batch_size = batch_size
        self.hidden = self.init_hidden()

        if use_cuda:
            self.cuda()  # assumed body; the original line is missing from the post

    def init_hidden(self):
        # The axes semantics are (num_layers, minibatch_size, hidden_dim)
        c = autograd.Variable(torch.zeros(1, self.batch_size, self.hidden_dim))
        h = autograd.Variable(torch.zeros(1, self.batch_size, self.hidden_dim))
        if use_cuda:
            c = c.cuda()
            h = h.cuda()
        return (h, c)  # nn.LSTM expects the tuple in (h_0, c_0) order

    def forward(self, input_seqs, input_lengths):        
        packed = torch.nn.utils.rnn.pack_padded_sequence(input_seqs, input_lengths, batch_first=False)        
        outputs, self.hidden = self.lstm(packed, self.hidden)                
        outputs, output_lengths = torch.nn.utils.rnn.pad_packed_sequence(outputs)
        tag_space = self.hidden2tag(outputs)
        return tag_space

hidden_dim = 4
batch_size = 3
rnn = RNN2(hidden_dim, 2, batch_size)
if use_cuda:
    rnn = rnn.cuda()  # assumed body; the original line is missing from the post

When I run it as tag_scores = rnn(input_var, input_lengths), I get the following error:

/usr/local/lib/python2.7/dist-packages/torch/functional.pyc in matmul(tensor1, tensor2, out)
    166     elif dim_tensor1 == 1 and dim_tensor2 == 2:
    167         if out is None:
--> 168             return torch.mm(tensor1.unsqueeze(0), tensor2).squeeze_(0)
    169         else:
    170             return torch.mm(tensor1.unsqueeze(0), tensor2, out=out).squeeze_(0)

RuntimeError: Expected object of type Variable[torch.cuda.LongTensor] but found type Variable[torch.cuda.FloatTensor] for argument #1 'mat2'

However, my inputs are both LongTensor:

Variable containing:
   27     3    25
   57    21    38
   56     2    56
    2    42    61
   41    51     2
   61     2    40
   41    45    61
   55    58     2
    2     8    25
   49     2    38
   41    25    39
   58    56    41
    9     9     9
[torch.cuda.LongTensor of size 13x3 (GPU 0)]

Variable containing:
    1     0     0
    0     1     0
    0     0     0
    0     1     0
    0     0     0
    0     0     0
    1     0     0
    0     0     0
    0     0     1
    0     0     0
    0     0     0
    1     1     1
    0     0     1
[torch.cuda.LongTensor of size 13x3 (GPU 0)]

Can anyone help resolve the issue and explain why it happens?

I also posted the question on SO:

Thank you!

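The LSTM wants floats: its weights are FloatTensors, and the matrix multiplication needs both operands to have the same type, so a LongTensor input triggers this error. You could call .float() on the inputs. (One level up, the full backtrace would likely show that the matmul arguments are the input and a weight, or something similar.)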
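
To illustrate the type mismatch, here is a minimal sketch on the CPU. It assumes a recent PyTorch release, where plain tensors replace Variable; the names indices and inputs and the sizes are made up for the example and are not from the original post.

```python
import torch
import torch.nn as nn

# Same layer configuration as in the question: input_size=1, hidden_size=4, num_layers=1
lstm = nn.LSTM(1, 4, 1)

# Hypothetical batch of character indices, shape (seq_len=5, batch=3)
indices = torch.randint(0, 62, (5, 3))
print(indices.dtype)            # integer type (a LongTensor)
print(lstm.weight_ih_l0.dtype)  # floating-point type, like all LSTM weights

# Cast to float so the input matches the weight type. The trailing
# dimension of size 1 is added because the layer has input_size=1.
inputs = indices.float().unsqueeze(-1)
outputs, (h_n, c_n) = lstm(inputs)
print(outputs.shape)            # (seq_len, batch, hidden_size)
```

Note that feeding raw indices as floats treats them as magnitudes; that matches what the question does, but an embedding layer is the more common choice.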

Best regards


PS: I don’t think it is a good idea to post a “help me with this error” question in multiple forums. It doubles the time people spend answering, so with a fixed amount of resources only half the questions can be answered.


Thanks, that works. I only needed to reshape my input_var into shape seq length x batch x 1.
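
For anyone landing here later, the reshape plus the float cast could look roughly like the sketch below. It assumes a recent PyTorch release and runs on the CPU; the random input_var, the input_lengths values, and the standalone lstm and hidden2tag layers are stand-ins for the ones in the question, not the original data.

```python
import torch
import torch.nn as nn
from torch.nn.utils.rnn import pack_padded_sequence, pad_packed_sequence

seq_len, batch_size, hidden_dim, tagset_size = 13, 3, 4, 2

lstm = nn.LSTM(1, hidden_dim, 1)                 # input_size, hidden_size, num_layers
hidden2tag = nn.Linear(hidden_dim, tagset_size)

# Made-up character indices (LongTensor) of shape (seq_len, batch)
input_var = torch.randint(0, 62, (seq_len, batch_size))
# Made-up lengths, sorted in decreasing order as pack_padded_sequence expects by default
input_lengths = [13, 11, 9]

# Cast to float and reshape to (seq_len, batch, 1) to match input_size=1
inputs = input_var.float().view(seq_len, batch_size, 1)

packed = pack_padded_sequence(inputs, input_lengths)   # batch_first=False by default
packed_out, (h_n, c_n) = lstm(packed)                  # initial state defaults to zeros
outputs, output_lengths = pad_packed_sequence(packed_out)

tag_space = hidden2tag(outputs)                        # (seq_len, batch, tagset_size)
print(tag_space.shape)
```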