ValueError: too many values to unpack (expected 2) when changing num_layers to 2

Snowbow · October 12, 2021, 5:08pm

Hello, I’m new to PyTorch and NLP.
I’m working on a reinforcement learning project and using a GRU as the network. When I change “num_layers” to 2, I get a ValueError:
“ValueError: too many values to unpack (expected 2)”

Here is my code:

class ActorNetwork(nn.Module):
    def __init__(self, input_size, alpha, hidden_size=64, num_layers=2):
        super(ActorNetwork, self).__init__()
        self.actor_rnn = nn.GRU(         
            input_size=input_size,
            hidden_size=hidden_size,         # rnn hidden unit
            num_layers=num_layers,           # number of rnn layer
            bidirectional=True,     
            batch_first=True,
            dropout=0.2
        )
        self.actor = nn.Linear(hidden_size * 2, 1)
        self.optimizer = optim.Adam(self.parameters(), lr=alpha)
        self.device = T.device('cuda:0' if T.cuda.is_available() else 'cpu')
        self.to(self.device)

    def init_weight(self):
        for name, param in self.actor_rnn.named_parameters():
            if 'bias' in name:
                nn.init.constant_(param, 1.0)
            elif 'weight' in name:
                nn.init.orthogonal_(param, gain=1)

    def forward(self, state):
        r_out, (h_n, h_c) = self.actor_rnn(state, None)  
        act_pri = self.actor(r_out)
        dist = T.squeeze(act_pri)  # from 3 dim to 1
        # if there is only one element, squeeze returns a 0-dim scalar
        if dist.dim() == 0:
            dist = T.unsqueeze(dist, 0)
        dist = Categorical(logits=dist)
        return dist

When num_layers is 1, however, the code runs normally.

ptrblck · October 13, 2021, 7:38am

As described in the docs, nn.GRU returns two values: output and h_n. Unlike nn.LSTM, it has no cell state.
You are unpacking the second return value into h_n and h_c. This happens to work with a single layer, because the shape of h_n is defined as [directions * num_layers, batch_size, H_out] (directions=2 since you are using bidirectional=True), so iterating over its first dimension yields exactly two tensors. Note that these two tensors are the forward and backward hidden states, not a hidden state and a cell state.
However, with num_layers=2 the first dimension has a size of 4 instead of 2, so unpacking this tensor into h_n and h_c fails. Assign it to h_n only and split it afterwards if needed.
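
A minimal sketch of what I mean, using a standalone nn.GRU with the same settings as your actor_rnn. The sizes (input_size=8, a batch of 3, sequences of length 5) and the names h_n_split, h_last_fwd and h_last_bwd are made up for illustration and are not from your code. It reproduces the failing unpack, then keeps h_n as a single tensor and splits it by layer and direction:

```python
import torch
import torch.nn as nn

# Illustrative sizes only; these are assumptions, not values from the original post.
input_size, hidden_size, num_layers = 8, 64, 2
batch_size, seq_len = 3, 5

gru = nn.GRU(
    input_size=input_size,
    hidden_size=hidden_size,
    num_layers=num_layers,
    bidirectional=True,
    batch_first=True,
    dropout=0.2,
)

state = torch.randn(batch_size, seq_len, input_size)

# nn.GRU returns (output, h_n): keep h_n as one tensor.
r_out, h_n = gru(state)
print(r_out.shape)  # torch.Size([3, 5, 128])
print(h_n.shape)    # torch.Size([4, 3, 64])

# Unpacking iterates over dim 0 (size 4 here), so two targets are not enough.
try:
    h_a, h_b = h_n
except ValueError as err:
    unpack_error = str(err)
    print(unpack_error)

# Split afterwards if needed: [num_layers, directions, batch_size, hidden_size].
h_n_split = h_n.reshape(num_layers, 2, batch_size, hidden_size)
h_last_fwd = h_n_split[-1, 0]  # last layer, forward direction
h_last_bwd = h_n_split[-1, 1]  # last layer, backward direction
```

In your forward method the equivalent change would be `r_out, h_n = self.actor_rnn(state, None)`; since you only use r_out afterwards, the split is optional.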

Snowbow · October 13, 2021, 8:40am

Thank you for the answer! That solved my problem.