[solved] Batching process of torch.nn.Linear

When we do the forward step of an RNN, we get an output tensor shaped like (batch, seq_len, data_in).

How can I use torch.nn.Linear to reduce the dimension of my data_in?

  1. RNN output: (batch, seq_len, data_in)
  2. I am only interested in the last time step, so indexing with [:, -1, :] gives (batch, data_in) (see the shape check after this list).
  3. The dimension of data_in is not what I want, so I want to apply a Linear layer to it.
    (However, flattening data_in with view first is not what I want, because that would tie the Linear layer's parameters to the batch size.)
    Should I just use a for loop over each batch?
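To make step 2 concrete, a quick shape check (the sizes are just the ones from my example below):

import torch

r_out = torch.randn(5, 2, 64)   # (batch, seq_len, data_in)
last = r_out[:, -1, :]          # last time step: (batch, data_in)
print(last.shape)               # torch.Size([5, 64])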

Example:

import torch
import torch.nn as nn

class MyNetwork(nn.Module):
    def __init__(self, batch):
        super(MyNetwork, self).__init__()
        self.batch = batch
        self.rnn = nn.LSTM(
            input_size=24576,
            hidden_size=64,
            num_layers=3,
            batch_first=True)
        self.rnn.cuda()

        self.linear = nn.Linear(320, 7)  # nn.Linear(64, 7)?
        self.linear.cuda()

    def forward(self, x):
        # x.shape = (5, 2, 24576)
        r_out, (h_n, h_c) = self.rnn(x)
        # r_out.shape = (5, 2, 64)

        # what I want: (5, 2, 64) => (5, 64) ==(linear)==> (5, 7)

Related post:

  1. Inferring shape via flatten operator
    The solution provided by fmassa restricts the network to a fixed batch size. For example, if I train with batch size 5 but want to test with batch size 1, fmassa's approach gives a dimension mismatch.

The linear layer should be

self.linear = nn.Linear(64, 7)

then this should work

out = self.linear(r_out[:,-1,:])
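For completeness, a minimal self-contained sketch of that fix (input/hidden sizes taken from the example above; the .cuda() calls are dropped so it runs anywhere). Since nn.Linear only acts on the last dimension, the same layer works for any batch size:

import torch
import torch.nn as nn

class MyNetwork(nn.Module):
    def __init__(self):
        super(MyNetwork, self).__init__()
        self.rnn = nn.LSTM(
            input_size=24576,
            hidden_size=64,
            num_layers=3,
            batch_first=True)
        self.linear = nn.Linear(64, 7)

    def forward(self, x):
        r_out, (h_n, h_c) = self.rnn(x)       # r_out: (batch, seq_len, 64)
        return self.linear(r_out[:, -1, :])   # last time step -> (batch, 7)

model = MyNetwork()
print(model(torch.randn(5, 2, 24576)).shape)  # torch.Size([5, 7])
print(model(torch.randn(1, 2, 24576)).shape)  # torch.Size([1, 7])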