Seq2seq: For unbatched 2-D input, hx and cx should also be 2-D but got (3-D, 3-D) tensors

I debugged the program step by step. The first step is the encoder layer, which runs without any error, as shown below:

import torch
import torch.nn as nn

input_dim = len(en_vocab)
output_dim = 11
encoder_embedding_dim = 512
decoder_embedding_dim = 512
hidden_dim = 1024
n_layers = 2
encoder_dropout = 0.5
decoder_dropout = 0.5
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

encoder = Encoder(
    input_dim,
    encoder_embedding_dim,
    hidden_dim,
    n_layers,
    encoder_dropout,
)
hidden, cell = encoder(batch["numericalized_input"])
hidden.shape, cell.shape

Output:

src shape: torch.Size([26, 3])
embedded shape: torch.Size([26, 3, 512])
hidden shape: torch.Size([2, 26, 1024])
(torch.Size([2, 26, 1024]), torch.Size([2, 26, 1024]))
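
The Encoder class itself is not shown here. A minimal sketch consistent with the printed shapes would look like the following; the key assumption (not confirmed by the code above) is that the encoder's LSTM was created with batch_first=True, which is what would make hidden come out as [2, 26, 1024] instead of [2, 3, 1024] for a [26, 3] seq-first batch:

class Encoder(nn.Module):
    # Hypothetical reconstruction, not the original code from this thread.
    def __init__(self, input_dim, embedding_dim, hidden_dim, n_layers, dropout):
        super().__init__()
        self.embedding = nn.Embedding(input_dim, embedding_dim)
        # Assumption: batch_first=True, which explains the printed hidden shape.
        self.rnn = nn.LSTM(embedding_dim, hidden_dim, n_layers,
                           dropout=dropout, batch_first=True)
        self.dropout = nn.Dropout(dropout)

    def forward(self, src):
        print('src shape:', src.shape)            # [26, 3] = [seq_len, batch]
        embedded = self.dropout(self.embedding(src))
        print('embedded shape:', embedded.shape)  # [26, 3, 512]
        # batch_first=True reads [26, 3, 512] as [batch=26, seq_len=3, 512],
        # so hidden becomes [2, 26, 1024] rather than [2, 3, 1024].
        outputs, (hidden, cell) = self.rnn(embedded)
        print('hidden shape:', hidden.shape)      # [2, 26, 1024]
        return hidden, cell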

My decoder layer

class Decoder(nn.Module):
    def __init__(self, output_dim, hidden_dim, n_layers, dropout):
        super().__init__()
        self.output_dim = output_dim
        self.hidden_dim = hidden_dim
        self.n_layers = n_layers
        self.rnn = nn.LSTM(output_dim, hidden_dim, n_layers, dropout=dropout, batch_first=True)
        self.fc_out = nn.Linear(hidden_dim, output_dim)
        self.dropout = nn.Dropout(dropout)

    def forward(self, input, hidden, cell):
        # input arrives as one time step across the batch, shape [batch_size]
        input = input.unsqueeze(0)  # -> [1, batch_size], still only 2-D
        input = input.float()
        print('input:', input)
        output, (hidden, cell) = self.rnn(input, (hidden, cell))
        prediction = self.fc_out(output.squeeze(0))
        return prediction, hidden, output

When I run this program

decoder = Decoder(
    output_dim,
    hidden_dim,
    n_layers,
    decoder_dropout,
)
prediction, hidden, output = decoder(batch["target"][0], hidden, cell)
prediction.shape, hidden.shape

I’m getting the following error:

RuntimeError: For unbatched 2-D input, hx and cx should also be 2-D but got (3-D, 3-D) tensors
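
As far as I can tell, nn.LSTM decides between batched and unbatched mode from the dimensionality of its input: a 3-D input is batched and needs 3-D hidden and cell states, while a 2-D input is treated as a single unbatched sequence and needs 2-D states. In my decoder, batch["target"][0] has shape [3] (one time step across the batch of 3), and unsqueeze(0) only makes it [1, 3], which is still 2-D, while hidden and cell from the encoder are 3-D. A small sketch of the shape rules, with made-up sizes:

import torch
import torch.nn as nn

rnn = nn.LSTM(input_size=11, hidden_size=1024, num_layers=2, batch_first=True)

# Batched: 3-D input goes with 3-D hidden/cell.
x = torch.randn(3, 1, 11)      # [batch, seq_len, input_size]
h0 = torch.zeros(2, 3, 1024)   # [n_layers, batch, hidden_dim]
c0 = torch.zeros(2, 3, 1024)
out, (h, c) = rnn(x, (h0, c0))  # OK

# Unbatched: 2-D input requires 2-D hidden/cell.
x2 = torch.randn(1, 11)        # [seq_len, input_size], no batch dimension
out2, (h2, c2) = rnn(x2, (h0[:, 0], c0[:, 0]))  # 2-D states: [n_layers, hidden_dim]

# Mixing the two modes reproduces the reported error:
# rnn(x2, (h0, c0))  # RuntimeError: For unbatched 2-D input, hx and cx should also be 2-D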

Does anyone have a solution? Kindly reply. Thanks in advance.

What are you trying to do in the first place, i.e., what kind of task are you trying to learn?

I’m just asking since it seems odd that batch["target"][0] is the input parameter of the forward() method of the decoder.

I’m trying to build a seq2seq model with text as input and numerical data as output. I convert the input text to word indices to pass through the encoder, and prepare the numerical data for the decoder.

batch_size = 3
train_data_loader = get_data_loader(train_data, batch_size, pad_index, shuffle=True)

To debug the program, I just loop over the batches and print batch["numericalized_input"] and batch["target"]:

for i, batch in enumerate(train_data_loader):
    print('i', i)
    src = batch["numericalized_input"]
    trg = batch["target"]
    print('src:', src)
    print('trg:', trg)

Output:

i 0
src: tensor([[ 2,  2,  2],
        [ 4,  4,  4],
        [ 0,  0,  0],
        [ 5,  5,  5],
        [ 4,  4,  4],
        [ 0,  0,  0],
        [ 0,  0,  6],
        [18,  3,  0],
        [ 4,  1,  0],
        [23,  1,  4],
        [ 6,  1, 11],
        [ 5,  1,  0],
        [ 0,  1,  7],
        [ 9,  1,  0],
        [ 0,  1,  7],
        [ 0,  1,  6],
        [21,  1, 13],
        [ 7,  1,  4],
        [20,  1, 16],
        [ 8,  1,  8],
        [ 4,  1, 17],
        [11,  1,  8],
        [ 3,  1,  4],
        [ 1,  1, 15],
        [ 1,  1, 18],
        [ 1,  1,  4],
        [ 1,  1, 14],
        [ 1,  1,  3]])
trg: tensor([[ 2,  2,  2],
        [ 9, 15,  7],
        [ 6,  5,  8],
        [ 7,  7,  9],
        [22,  6, 22],
        [ 5,  9,  5],
        [15,  8, 15],
        [15,  9, 15],
        [22,  8, 22],
        [33,  9, 33],
        [ 3,  3,  3]])
i 1
src: tensor([[ 2,  2,  2],
        [ 9, 19,  6],
        [ 0,  5,  5],
        [ 4, 12,  0],
        [11,  0,  9],
        [24,  3,  4],
        [ 0,  1,  0],
        [ 0,  1,  7],
        [22,  1,  6],
        [ 0,  1,  5],
        [10,  1,  0],
        [ 0,  1,  0],
        [ 4,  1,  0],
        [16,  1,  4],
        [ 8,  1, 15],
        [17,  1, 21],
        [ 0,  1,  7],
        [ 6,  1, 20],
        [ 0,  1,  0],
        [ 0,  1, 24],
        [10,  1,  0],
        [ 0,  1,  7],
        [ 4,  1,  0],
        [23,  1, 12],
        [ 3,  1,  0],
        [ 1,  1,  3]])
trg: tensor([[ 2,  2,  2],
        [ 8, 22,  9],
        [ 9,  5,  5],
        [ 4,  8, 33],
        [22, 22,  7],
        [ 5,  5,  8],
        [15, 15,  9],
        [15, 15,  7],
        [22, 22,  8],
        [33, 33,  9],
        [ 3,  3,  3]])
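
One way to make the shapes line up, assuming you keep the data seq-first as in the printouts and feed the decoder one numeric target value per step (input_size=1; both are assumptions about your setup, not your original code): leave batch_first at its default of False in both LSTMs, so the encoder's hidden and cell come out as [n_layers, batch, hidden_dim], and shape each decoder step as [1, batch, 1] so the LSTM stays in batched mode and accepts those 3-D states. A minimal sketch:

import torch
import torch.nn as nn

batch_size, hidden_dim, n_layers = 3, 1024, 2

# Encoder side: seq-first input [seq_len, batch, features] with the default
# batch_first=False yields hidden/cell of shape [n_layers, batch, hidden_dim].
enc_rnn = nn.LSTM(512, hidden_dim, n_layers)
embedded = torch.randn(26, batch_size, 512)    # like the embedded tensor above
_, (hidden, cell) = enc_rnn(embedded)
print(hidden.shape)                            # torch.Size([2, 3, 1024])

# Decoder side: one time step per call, shaped [1, batch, input_size], so the
# 3-D hidden/cell from the encoder match what the LSTM expects.
dec_rnn = nn.LSTM(1, hidden_dim, n_layers)     # input_size=1: one number per step
step = torch.tensor([9., 15., 7.])             # one target time step, e.g. trg[1]
step = step.unsqueeze(0).unsqueeze(-1)         # [1, 3, 1] = [seq_len=1, batch, 1]
output, (hidden, cell) = dec_rnn(step, (hidden, cell))
print(output.shape)                            # torch.Size([1, 3, 1024])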