Combining LSTM and GRU


I’m working on a project where I want to use the output of some NN function as the hidden state of an LSTM. That NN function is built around a GRU. Essentially my question is: how do I feed the output of a GRU in as the hidden input of an LSTM? A GRU’s state is just a hidden state, while an LSTM’s state is a hidden state plus a cell state. Up to now I’ve been setting the cell state to zeros and the hidden state to the output of the GRU. I want to generate sentences. I’ll include some code, but it may not be self-explanatory.

        token = SOS_token
        outputs = []
        # reorder decoder_hidden (the GRU output) so its dimensions match what
        # the LSTM expects for its initial hidden state
        decoder_hidden = decoder_hidden.permute(1, 0, 2)
        self.h0 = nn.Parameter(decoder_hidden, requires_grad=False)
        # the cell state has no GRU counterpart, so it starts at zeros
        self.c0 = nn.Parameter(torch.zeros_like(self.h0), requires_grad=False)

        for i in range(self.maxtokens):
            # embed the previous token (Variable is deprecated, plain tensors work)
            output = self.embed(torch.tensor([token]))
            output = self.dropout_b(output)

            output, (hn, cn) = self.lstm(output, (self.h0, self.c0))
            # carry the LSTM state forward to the next step
            self.h0 = nn.Parameter(hn, requires_grad=False)
            self.c0 = nn.Parameter(cn, requires_grad=False)

            # project to vocabulary logits and greedily pick the next token
            output_x = self.out_c(output)
            output_x = self.dropout(output_x)
            outputs.append(output_x)
            token = torch.argmax(output_x, dim=2)

decoder_hidden holds the values I want to pass to the LSTM’s hidden input. SOS_token is the ‘start of sentence’ token, maxtokens is the number of words per sentence, embed is the embedding matrix, and outputs is the list of generated outputs. The goal is to pass the output of another NN to the LSTM as a ‘thought vector’. I don’t know, will that work?
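In case the snippet above is hard to follow out of context, here is a stripped-down, self-contained sketch of the wiring I mean (the layer sizes, token values, and greedy loop here are placeholders, not my real model):

    import torch
    import torch.nn as nn

    # toy sizes, just for illustration
    vocab_size, embed_dim, hidden_dim, num_layers = 100, 32, 64, 1

    embed = nn.Embedding(vocab_size, embed_dim)
    gru = nn.GRU(embed_dim, hidden_dim, num_layers, batch_first=True)
    lstm = nn.LSTM(embed_dim, hidden_dim, num_layers, batch_first=True)
    out_proj = nn.Linear(hidden_dim, vocab_size)

    # encode a source sequence with the GRU
    src = torch.randint(0, vocab_size, (1, 10))        # (batch, src_len)
    _, gru_hidden = gru(embed(src))                    # (num_layers, batch, hidden)

    # use the GRU hidden state as the LSTM's initial hidden state,
    # with a zero cell state since the GRU has no equivalent
    hidden = (gru_hidden, torch.zeros_like(gru_hidden))

    # greedy decoding loop, starting from the start-of-sentence token
    SOS_token, max_tokens = 1, 12
    token = torch.tensor([[SOS_token]])                # (batch, 1)
    outputs = []
    for _ in range(max_tokens):
        step_out, hidden = lstm(embed(token), hidden)  # (batch, 1, hidden)
        logits = out_proj(step_out)                    # (batch, 1, vocab)
        outputs.append(logits)
        token = logits.argmax(dim=2)                   # next input token

This mirrors the zero-cell-state setup I described above; my real model wraps this in a module and gets decoder_hidden from the other network.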