Backpropagation Over Summed Hidden States in BiRNN

Hello, I am trying to implement my own bidirectional RNN model where the output is defined as the sum of the hidden states from both directions. I want to know whether autograd can backpropagate through this summation.

def forward(self, x, state=None):

    hidden_f = []  # forward-direction hidden states
    hidden_b = []  # backward-direction hidden states

    batch_sz, sequence_len = x.size(0), x.size(1)

    hidden_sz = self.cell.hidden_size

    if state is None:
        h_n_f = torch.zeros(batch_sz, hidden_sz)
        h_n_b = torch.zeros(batch_sz, hidden_sz)
    else:
        h_n_f = state[:, :, 0]
        h_n_b = state[:, :, 1]

    for i in range(sequence_len):

        h_n_f = self.cell(x[:, i, :], h_n_f)       # next hidden state in the forward direction
        hidden_f.append(h_n_f)

        h_n_b = self.cell(x[:, -i - 1, :], h_n_b)  # next hidden state in the backward direction
        hidden_b.append(h_n_b)

    # MY PROBLEMS START HERE
    # requires_grad=True here ==> leaf node moved to interior
    output = torch.zeros(batch_sz, sequence_len, hidden_sz)  # buffer for recording the outputs

    # output = Variable(torch.zeros(batch_sz, sequence_len, hidden_sz))

    # COULD BE PROBLEMATIC FOR BACKPROP
    for i in range(sequence_len):
        # output at step i is the sum of the forward state at position i and the
        # backward state computed for the same input position (hidden_b[-i-1])
        output[:, i, :] = hidden_f[i] + hidden_b[-i - 1]

    return output

Hi,

Do you actually see an error?
From your comments: you should not set requires_grad=True on the output unless you need to access its .grad field later, and you don't need to wrap things in Variable (plain tensors work with autograd).
So yes, it should work just fine.
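
For example, here is a minimal standalone check (the tensors are just toy stand-ins for your hidden states, not code from your model) showing that index assignment into a plain zeros tensor is recorded by autograd:

import torch

a = torch.randn(2, 3, requires_grad=True)   # stands in for one forward hidden state
b = torch.randn(2, 3, requires_grad=True)   # stands in for one backward hidden state

out = torch.zeros(2, 4, 3)                  # preallocated output buffer, no requires_grad
out[:, 1, :] = a + b                        # index assignment of the summed states

out.sum().backward()
print(a.grad)                               # filled with ones -> the gradient reached a
print(b.grad)                               # filled with ones -> the gradient reached b

Both .grad fields come back filled, so the graph does go through the indexed writes.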

Thank you! The code executes just fine and I am not seeing any errors. I just wanted to be sure that backpropagation can take into account the sum of the two hidden states from both directions inside the final for loop, which is a slightly unusual operation. I am guessing that for loops and list summations are taken care of by autograd and that I don't need to do .unbind() or any other kind of operation. I was also confused because I am initializing the root variable output as zeros and then setting each entry using other variables.
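
For reference, the same output can also be built without a preallocated zeros buffer by collecting the per-step sums in a list and stacking them; the sketch below is standalone, with toy random tensors standing in for the hidden_f and hidden_b lists from the forward above:

import torch

batch_sz, sequence_len, hidden_sz = 2, 4, 3   # toy sizes, for illustration only
hidden_f = [torch.randn(batch_sz, hidden_sz, requires_grad=True) for _ in range(sequence_len)]
hidden_b = [torch.randn(batch_sz, hidden_sz, requires_grad=True) for _ in range(sequence_len)]

# sum the forward state at step i with the backward state for the same position,
# then stack along the time dimension instead of writing into a zeros buffer
summed = [hidden_f[i] + hidden_b[-i - 1] for i in range(sequence_len)]
output = torch.stack(summed, dim=1)           # shape: (batch_sz, sequence_len, hidden_sz)

output.sum().backward()
print(hidden_f[0].grad.shape)                 # gradients reach every element of both lists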

Yes, that will work just fine.
In general, if autograd computes something, it is right. We (should) always raise an error if something would otherwise give a wrong result!
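
For example, here is a small illustration (not from this thread) of the kind of case autograd refuses rather than silently returning wrong gradients: sigmoid saves its output for the backward pass, so modifying that output in place makes backward raise a RuntimeError.

import torch

x = torch.ones(3, requires_grad=True)
y = torch.sigmoid(x)    # sigmoid's backward needs its own output y
y.add_(1)               # in-place change of a tensor needed for backward
try:
    y.sum().backward()
except RuntimeError as err:
    print("autograd refused:", err)   # complains about the in-place modification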
