Hello,
I am trying to re-implement an answer retrieval system, previously implemented in Theano (the Sequential Matching Network, SMN), in PyTorch.
The forward method of the model I implemented looks like this:
def forward(self, context_utts, response_utt, num_context_utt=10):
    # Embedding encoding
    embeddings_context = []
    for i in range(num_context_utt):
        embeddings_context.append(self.embedding_layer(context_utts[i]))
    embeddings_response = self.embedding_layer(response_utt)

    # GRU encoding
    gru_hidden_context = []
    for i in range(num_context_utt):
        # packed_emb = embeddings_context[i]
        # if lengths is not None and not self.no_pack_padded_seq:
        #     # Lengths data is wrapped inside a Variable.
        #     lengths = lengths.view(-1).tolist()
        #     packed_emb = pack(emb, lengths)
        outputs, _ = self.RNN_first_layer(embeddings_context[i], None)
        # if lengths is not None and not self.no_pack_padded_seq:
        #     outputs = unpack(outputs)[0]
        gru_hidden_context.append(outputs)
    # Response
    outputs, _ = self.RNN_first_layer(embeddings_response, None)
    gru_hidden_response = outputs

    # Convolution layer
    convolution_output = []
    for i in range(num_context_utt):
        convolution_output.append(self.Convolution(embeddings_context[i], embeddings_response,
                                                   gru_hidden_context[i], gru_hidden_response))
    # Stack the convolution outputs along dim 1: (batch, num_utt, features)
    convolution_output = torch.stack(convolution_output, 1)

    # GRU final layer
    outputs, _ = self.RNN_final_layer(convolution_output, None)
    # SMN_last: only keep the last hidden state of the output sequence
    output = outputs[:, -1, :]

    # Logistic regression
    pred_prob, pred_class = self.LogisticRegression(output)
    return pred_prob, pred_class
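For reference, this is roughly how I call it; the sizes below are made up, just to show the input layout I am assuming (batch_first GRUs, one (batch, seq_len) LongTensor per context utterance):

# Hypothetical sizes, only to illustrate the expected input layout.
import torch

batch_size, num_utt, seq_len, vocab_size = 32, 10, 50, 20000
context_utts = [torch.randint(0, vocab_size, (batch_size, seq_len)) for _ in range(num_utt)]
response_utt = torch.randint(0, vocab_size, (batch_size, seq_len))
pred_prob, pred_class = model(context_utts, response_utt, num_context_utt=num_utt)  # model = the SMN module above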
The code runs fine: the NLL loss is computed correctly afterwards, and I can see the optimizer updating the parameter values after the gradients are computed. The problem is that the code is a bit slow, and I think it is mainly because of the Python loops in the forward pass: building gru_hidden_context and convolution_output are the slowest operations.
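For completeness, I assume another way to speed this up would be to fold the utterance loop into the batch dimension so the embedding layer and the first GRU run only once; a rough, untested sketch of that idea (the helper name and the batch_first shapes above are assumptions on my part):

def encode_context_batched(self, context_utts, num_context_utt=10):
    # Hypothetical helper: merge the utterance dimension into the batch dimension
    # so the embedding layer and the first GRU are called once instead of in a loop.
    stacked = torch.stack(context_utts[:num_context_utt], 0)   # (num_utt, batch, seq_len)
    num_utt, batch, seq_len = stacked.shape
    flat = stacked.view(num_utt * batch, seq_len)               # (num_utt*batch, seq_len)
    emb = self.embedding_layer(flat)                            # (num_utt*batch, seq_len, emb_dim)
    outputs, _ = self.RNN_first_layer(emb, None)                # single batch_first GRU call
    return outputs.view(num_utt, batch, seq_len, -1)            # one output tensor per utterance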
For now, though, I just tried turning those lists into generators. The code now looks like this:
def forward(self, context_utts, response_utt, num_context_utt=10):
    # Embedding encoding
    embeddings_context = []
    for i in range(num_context_utt):
        embeddings_context.append(self.embedding_layer(context_utts[i]))
    embeddings_response = self.embedding_layer(response_utt)

    # GRU encoding (generator; [0] keeps only the outputs, not the hidden state)
    gru_hidden_context = (self.RNN_first_layer(embeddings_context[i], None)[0]
                          for i in range(num_context_utt))
    # Response
    outputs, _ = self.RNN_first_layer(embeddings_response, None)
    gru_hidden_response = outputs

    # Convolution layer (generator)
    convolution_output = (self.Convolution(embeddings_context[i], embeddings_response,
                                           next(gru_hidden_context), gru_hidden_response)
                          for i in range(num_context_utt))
    # Stack the convolution outputs together -> this is where it now fails
    convolution_output = torch.stack(convolution_output, 1)

    # GRU final layer
    outputs, _ = self.RNN_final_layer(convolution_output, None)
    # SMN_last: only keep the last hidden state of the output sequence
    output = outputs[:, -1, :]

    # Logistic regression
    pred_prob, pred_class = self.LogisticRegression(output)
    return pred_prob, pred_class
Now both loops go much faster, but I have a new problem:
TypeError: stack(): argument 'tensors' (position 1) must be a tuple of Tensors, not generator
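The obvious workaround I can think of is to materialise the generator right before stacking, something like:

# Force the generator into a tuple so torch.stack() accepts it (untested here).
convolution_output = torch.stack(tuple(convolution_output), 1)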
I assume that works, but it just rebuilds the sequence I was trying to avoid. I want to keep my code fast: is there a way of stacking the elements of a generator without materialising them all first?
thanks in advance