Converting an RNN to a fully connected network gives an error

Hi all,

I am new to this, but I would like to understand the problem I am running into.

I converted an RNN model, shown below:

class RNN(nn.Module):
    def __init__(self, input_dim, embedding_dim, hidden_dim, output_dim):
        super().__init__()

        self.embedding = nn.Embedding(input_dim, embedding_dim)
        self.rnn = nn.RNN(embedding_dim, hidden_dim)
        self.fc = nn.Linear(hidden_dim, output_dim)

    def forward(self, x):
        embedded = self.embedding(x)
        output, hidden = self.rnn(embedded)
        out = self.fc(hidden)

        return out

to a fully connected network, shown below:

class Net(nn.Module):
    def __init__(self, input_dim, embedding_dim, hidden_dim, output_dim):
        super().__init__()

        self.embedding = nn.Embedding(input_dim, embedding_dim)
        self.fc1 = nn.Linear(embedding_dim, hidden_dim)
        self.fc2 = nn.Linear(hidden_dim, output_dim)

    def forward(self, x):
        x = self.embedding(x)
        x = self.fc1(x)
        x = self.fc2(x)

        return x

And I set up both with the same dimensions, as below:

INPUT_DIM = len(TEXT.vocab)
EMBEDDING_DIM = 300
HIDDEN_DIM = 374
OUTPUT_DIM = 2

And below is how I instantiate both models:

model = RNN(INPUT_DIM, EMBEDDING_DIM, HIDDEN_DIM, OUTPUT_DIM)

RNN(
  (embedding): Embedding(20002, 300)
  (rnn): RNN(300, 374)
  (fc): Linear(in_features=374, out_features=2, bias=True)
)

and

model = Net(INPUT_DIM, EMBEDDING_DIM, HIDDEN_DIM, OUTPUT_DIM)

Net(
  (embedding): Embedding(20002, 300)
  (fc1): Linear(in_features=300, out_features=374, bias=True)
  (fc2): Linear(in_features=374, out_features=2, bias=True)
)

And when I ran them, the RNN model worked well, but the Net model gave me the error below:

ValueError: Expected input batch_size (42) to match target batch_size (20).

I am really curious about what happened. My assumption was that an error should not occur just because I changed the network architecture.

Sorry for the stupid question, but I really hope I can gain more knowledge from it.

I don’t know what input shape you are using, but I assume you are passing a 3-dimensional tensor?
If that’s the case, I would recommend checking all shapes in both models and making sure the layers work as intended.
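One generic way to do that check (a debugging sketch, not part of your code — `collect_shapes` is a hypothetical helper) is to hook every leaf module and record its output shape in a single forward pass:

```python
import torch
import torch.nn as nn

def collect_shapes(model, x):
    """Run one forward pass and record each leaf module's output shape."""
    shapes = {}
    hooks = []
    for name, module in model.named_modules():
        if len(list(module.children())) == 0:  # leaf modules only
            hooks.append(module.register_forward_hook(
                # RNN layers return a tuple, so handle both cases
                lambda m, inp, out, name=name: shapes.__setitem__(
                    name, out.shape if torch.is_tensor(out)
                    else tuple(o.shape for o in out))))
    model(x)
    for h in hooks:
        h.remove()
    return shapes

# Tiny demo model, dimensions chosen arbitrarily
model = nn.Sequential(nn.Embedding(10, 8), nn.Linear(8, 2))
x = torch.randint(0, 10, (5, 3))
print(collect_shapes(model, x))
# {'0': torch.Size([5, 3, 8]), '1': torch.Size([5, 3, 2])}
```

Running this on both of your models with the same input tensor should show exactly where the shapes diverge.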

E.g. nn.RNN expects an input of shape [seq_len, batch_size, features] by default, while nn.Linear accepts [batch_size, *, in_features].
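To make these conventions concrete, here is a minimal sketch using your dimensions (seq_len=42 and batch_size=20 are assumptions based on the error message):

```python
import torch
import torch.nn as nn

seq_len, batch_size, features, hidden = 42, 20, 300, 374
x = torch.randn(seq_len, batch_size, features)

rnn = nn.RNN(features, hidden)
output, h_n = rnn(x)
print(output.shape)  # torch.Size([42, 20, 374]) -- one output per time step
print(h_n.shape)     # torch.Size([1, 20, 374])  -- final hidden state only

fc = nn.Linear(features, hidden)
y = fc(x)
print(y.shape)       # torch.Size([42, 20, 374]) -- applied to every time step
```

So the linear layers in Net never collapse the sequence dimension, while `self.fc(hidden)` in the RNN model only ever sees the final hidden state.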

The hidden output of nn.RNN will have the shape [num_layers * num_directions, batch_size, hidden_size]. Note that without a permute operation, the self.fc layer will treat dim0 as the batch dimension.
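From the error message it looks like seq_len is 42 and batch_size is 20: Net's output keeps the sequence dimension, so the loss function sees 42 "samples" instead of 20. One possible fix (a sketch, since I can't see your training loop, and mean pooling is just one choice of reduction) is to collapse the sequence dimension before the final layer:

```python
import torch
import torch.nn as nn

class Net(nn.Module):
    def __init__(self, input_dim, embedding_dim, hidden_dim, output_dim):
        super().__init__()
        self.embedding = nn.Embedding(input_dim, embedding_dim)
        self.fc1 = nn.Linear(embedding_dim, hidden_dim)
        self.fc2 = nn.Linear(hidden_dim, output_dim)

    def forward(self, x):
        # x: [seq_len, batch_size]
        x = self.embedding(x)  # [seq_len, batch_size, embedding_dim]
        x = self.fc1(x)        # [seq_len, batch_size, hidden_dim]
        x = x.mean(dim=0)      # pool over the sequence -> [batch_size, hidden_dim]
        x = self.fc2(x)        # [batch_size, output_dim]
        return x

model = Net(20002, 300, 374, 2)
x = torch.randint(0, 20002, (42, 20))  # [seq_len=42, batch_size=20]
print(model(x).shape)  # torch.Size([20, 2])
```

Now the output batch dimension matches the target batch dimension, which should resolve the ValueError.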