How to convert a cudnn.BLSTM model to a bidirectional nn.LSTM model

  • I have a *.t7 model that consists of a few convolution layers and one cudnn.BLSTM() block. To convert the model to PyTorch, I create the same architecture in PyTorch and try to load the weights from the t7 file. I think the convolution layers are correct, but I have a doubt about the cudnn.BLSTM. When I extract the BLSTM weights, I get a one-dimensional list of millions of parameters, which matches the total number of parameters in the PyTorch LSTM. However, in PyTorch the weights and biases have a well-known structure: weight_ih_l0, weight_hh_l0, …, bias_ih_l0, bias_hh_l0, …, weight_ih_l0_reverse, …. In cudnn.BLSTM(), all parameters come as one flattened list, so how do I know the order and the shapes of the weights and biases?
    I debugged the cudnn.BLSTM structure in the th terminal and got some idea about the concatenation order and the shapes.
    Example
-- torch
rnn = cudnn.BLSTM(1, 1, 2, false, 0.5)
-- get the weights
weights = rnn:weights()
th> rnn:weights()
{
  1 : 
    {
      1 : CudaTensor - size: 1
      2 : CudaTensor - size: 1
      3 : CudaTensor - size: 1
      4 : CudaTensor - size: 1
      5 : CudaTensor - size: 1
      6 : CudaTensor - size: 1
      7 : CudaTensor - size: 1
      8 : CudaTensor - size: 1
    }
  2 : 
    {
      1 : CudaTensor - size: 1
      2 : CudaTensor - size: 1
      3 : CudaTensor - size: 1
      4 : CudaTensor - size: 1
      5 : CudaTensor - size: 1
      6 : CudaTensor - size: 1
      7 : CudaTensor - size: 1
      8 : CudaTensor - size: 1
    }
  3 : 
    {
      1 : CudaTensor - size: 2
      2 : CudaTensor - size: 2
      3 : CudaTensor - size: 2
      4 : CudaTensor - size: 2
      5 : CudaTensor - size: 1
      6 : CudaTensor - size: 1
      7 : CudaTensor - size: 1
      8 : CudaTensor - size: 1
    }
  4 : 
    {
      1 : CudaTensor - size: 2
      2 : CudaTensor - size: 2
      3 : CudaTensor - size: 2
      4 : CudaTensor - size: 2
      5 : CudaTensor - size: 1
      6 : CudaTensor - size: 1
      7 : CudaTensor - size: 1
      8 : CudaTensor - size: 1
    }
}

biases = rnn:biases()

th> rnn:biases()
{
  1 : 
    {
      1 : CudaTensor - size: 1
      2 : CudaTensor - size: 1
      3 : CudaTensor - size: 1
      4 : CudaTensor - size: 1
      5 : CudaTensor - size: 1
      6 : CudaTensor - size: 1
      7 : CudaTensor - size: 1
      8 : CudaTensor - size: 1
    }
  2 : 
    {
      1 : CudaTensor - size: 1
      2 : CudaTensor - size: 1
      3 : CudaTensor - size: 1
      4 : CudaTensor - size: 1
      5 : CudaTensor - size: 1
      6 : CudaTensor - size: 1
      7 : CudaTensor - size: 1
      8 : CudaTensor - size: 1
    }
  3 : 
    {
      1 : CudaTensor - size: 1
      2 : CudaTensor - size: 1
      3 : CudaTensor - size: 1
      4 : CudaTensor - size: 1
      5 : CudaTensor - size: 1
      6 : CudaTensor - size: 1
      7 : CudaTensor - size: 1
      8 : CudaTensor - size: 1
    }
  4 : 
    {
      1 : CudaTensor - size: 1
      2 : CudaTensor - size: 1
      3 : CudaTensor - size: 1
      4 : CudaTensor - size: 1
      5 : CudaTensor - size: 1
      6 : CudaTensor - size: 1
      7 : CudaTensor - size: 1
      8 : CudaTensor - size: 1
    }
}

all_flattened_params = rnn:parameters()

With this small example I see that rnn:parameters() puts the weights first and then the biases, in the order shown above. So (Python-style slicing):
weights = all_flattened_params[:-32]
biases = all_flattened_params[-32:]
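As a sanity check on that split: the 32 comes from 4 layer-direction tables × 8 tensors each, which is my reading of the debug dumps above (a sketch, not verified against the cuDNN source):

```python
# Toy BLSTM from the example above: input_size=1, hidden_size=1,
# num_layers=2, bidirectional.
hidden, num_layers, num_directions, num_gates = 1, 2, 2, 4
blocks = num_layers * num_directions        # 4 tables in rnn:weights()
weight_tensors = blocks * num_gates * 2     # 4 ih + 4 hh gate tensors per table
bias_tensors = blocks * num_gates * 2       # cuDNN keeps separate ih and hh biases
print(weight_tensors, bias_tensors)         # -> 32 32

# Stand-in for the flattened rnn:parameters() list: weights first, biases last.
flat = ["w%d" % i for i in range(weight_tensors)] + \
       ["b%d" % i for i in range(bias_tensors)]
weights, biases = flat[:-32], flat[-32:]
assert len(weights) == 32 and len(biases) == 32
```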
Now, how do I know the order of the weights and biases with respect to PyTorch's nn.LSTM()?
I assumed this order:
weight_ih_l0, weight_hh_l0, weight_ih_l0_reverse, weight_hh_l0_reverse, weight_ih_l1, …
bias_ih_l0, bias_hh_l0, bias_ih_l0_reverse, bias_hh_l0_reverse, … but my model does not give the right output!
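In case it helps anyone answer, here is a sketch of the mapping I would try, built on two assumptions: the four tables from rnn:weights() are ordered l0-forward, l0-reverse, l1-forward, l1-reverse, and the 8 tensors inside each table follow cuDNN's documented layout (4 input-to-hidden gate tensors, then 4 hidden-to-hidden ones, gate order i, f, g, o, which is also the order PyTorch concatenates into weight_ih_l* / weight_hh_l*, so no gate reshuffling). The function name is mine:

```python
import torch
import torch.nn as nn

def load_cudnn_blstm(lstm: nn.LSTM, weight_tables, bias_tables):
    """Copy cudnn.BLSTM tensors into a PyTorch nn.LSTM (a sketch).

    weight_tables / bias_tables: the nested lists from rnn:weights() /
    rnn:biases() -- one inner list of 8 flat tensors per layer-direction
    block, assumed ordered l0-forward, l0-reverse, l1-forward, l1-reverse.
    Within each block: 4 ih gate tensors then 4 hh gate tensors, in
    cuDNN's gate order i, f, g, o (assumption; same order as PyTorch).
    """
    H = lstm.hidden_size
    for idx, (w8, b8) in enumerate(zip(weight_tables, bias_tables)):
        layer, direction = divmod(idx, 2)        # forward/reverse interleaved
        suffix = f"l{layer}" + ("_reverse" if direction else "")
        in_size = w8[0].numel() // H             # layer 0: input size; later: 2*H
        with torch.no_grad():
            getattr(lstm, f"weight_ih_{suffix}").copy_(
                torch.cat([t.reshape(H, in_size) for t in w8[:4]]))
            getattr(lstm, f"weight_hh_{suffix}").copy_(
                torch.cat([t.reshape(H, H) for t in w8[4:]]))
            getattr(lstm, f"bias_ih_{suffix}").copy_(
                torch.cat([t.reshape(H) for t in b8[:4]]))
            getattr(lstm, f"bias_hh_{suffix}").copy_(
                torch.cat([t.reshape(H) for t in b8[4:]]))
```

To see the target names on the PyTorch side, `[n for n, _ in lstm.named_parameters()]` lists them; note that PyTorch keeps two bias vectors (bias_ih and bias_hh) per direction precisely because cuDNN stores them separately.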