I’m fairly new to PyTorch and I’m trying to design an 18-node LSTM using LSTMCell with teacher forcing. I’ve run into quite a few difficulties.
Here’s my model:
class tryLSTM(nn.moduleList):
    def __init__(self, input_size, hidden_size, batch_size):
        super(tryLSTM, self).__init__()
        self.input_size = input_size
        self.hidden_size = hidden_size
        self.batch_size = batch_size
        self.lstm0 = nn.LSTMCell(input_size, hidden_size, bias=True)
        self.lstm1 = nn.LSTMCell(input_size, hidden_size, bias=True)
        self.lstm2 = nn.LSTMCell(input_size, hidden_size, bias=True)
        .........
        self.lstm17 = nn.LSTMCell(input_size, hidden_size, bias=True)

    def init_hidden(self):
        # initialize the hidden state and the cell state to zeros
        hidden = torch.zeros(self.batch_size, self.hidden_size)
        cell = torch.zeros(self.batch_size, self.hidden_size)
        return hidden, cell

    def forward(self, x, hc):
        out =
        h_0, c_0 = hc
        h_1, c_1 = self.lstm1(x, h_0, c_0)
        out = h_1
        h_2, c_2 = self.lstm2(x, h_1, c_1)
        out = h_2
        ......
        h_17, c_17 = self.lstm17(x, h_16, c_16)
        out = h_17

model = tryLSTM(input_size=128, hidden_size=128, batch_size=18)
if gpu:
    model.cuda()
optimizer = optim.Adam(model.parameters(), lr=0.0001)
criterion = nn.BCELoss(weight=None, reduction='mean')
Here’s the training loop:
def train(epoch):
    model.train()
    # initialize hidden and cell state
    hc = model.init_hidden()
    for batch_idx, (data, target) in enumerate(train_loader):
        # Zero out the gradients
        optimizer.zero_grad()
        target = data[1:]
        print(target.size())
        # Put data on GPU
        if gpu:
            data = data.cuda()
            target = target.cuda()
        # Get outputs of LSTM
        output = model(data, hc)
        print(output.size)
        # Calculate loss
        loss = criterion(output, target)
        # Calculate gradients
        loss.backward()
        # Update model parameters
        optimizer.step()
        train_loss.append(loss.item())
Q.1 I’m getting the following error:
TypeError: forward() takes from 2 to 3 positional arguments but 4 were given
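For reference, here is a minimal sketch of how `nn.LSTMCell` is called. As far as I can tell from the documentation, the cell takes the input plus a single `(h, c)` tuple, so passing `h` and `c` as two separate arguments would explain the "4 were given" message. The tensor shapes below are assumptions chosen to match the sizes in this post, not the real image data.

```python
import torch
import torch.nn as nn

# Assumed sizes: 18 rows per batch, 128 features, matching the post.
cell = nn.LSTMCell(input_size=128, hidden_size=128)
x = torch.randn(18, 128)
h0 = torch.zeros(18, 128)
c0 = torch.zeros(18, 128)

# The hidden and cell states go in together as one tuple.
h1, c1 = cell(x, (h0, c0))

# Passing them as separate positional arguments raises a TypeError.
separate_args_error = None
try:
    cell(x, h0, c0)
except TypeError as err:
    separate_args_error = str(err)
```

Note that the cell expects a 2-D input of shape (batch, input_size), so a (18, 3, 128, 128) image tensor would have to be flattened or encoded first.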
I’m not sure if this is the correct way to build what I want.
My mini-batch is X with shape (18, 3, 128, 128), i.e. 18 images. What I want to achieve is as follows:
The 1st cell’s input is x, and its output h_1 should be similar to x.
The 2nd cell’s input is x and h_1, and its output h_2 should be similar to x,
and so on.
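The chain described above could be sketched roughly as follows. This is only a sketch under assumptions: `ChainedCells`, `num_cells`, and `steps` are hypothetical names, each step's input is assumed to already be a flat vector of length `input_size` (how an image becomes such a vector is left open), and each step keeps its own cell weights as in my model.

```python
import torch
import torch.nn as nn

class ChainedCells(nn.Module):
    """Hypothetical sketch: one LSTMCell per step, state passed down the chain."""

    def __init__(self, input_size, hidden_size, num_cells=18):
        super().__init__()
        self.hidden_size = hidden_size
        # A ModuleList registers every cell's parameters with the model.
        self.cells = nn.ModuleList(
            [nn.LSTMCell(input_size, hidden_size) for _ in range(num_cells)]
        )

    def forward(self, steps):
        # steps: (num_cells, batch, input_size), the ground-truth input per step
        batch = steps.size(1)
        h = steps.new_zeros(batch, self.hidden_size)
        c = steps.new_zeros(batch, self.hidden_size)
        outputs = []
        for cell, x_t in zip(self.cells, steps):
            # Teacher forcing: each cell sees the true input for its step,
            # plus the state produced by the previous cell.
            h, c = cell(x_t, (h, c))
            outputs.append(h)
        return torch.stack(outputs)  # (num_cells, batch, hidden_size)

model = ChainedCells(input_size=128, hidden_size=128, num_cells=18)
steps = torch.randn(18, 4, 128)  # assumed: 18 steps, batch of 4 sequences
outputs = model(steps)
```

With this layout, one call to `model(steps)` covers all 18 steps, and the output for step t can be compared against whatever target is chosen for that step.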
I believe the forward pass is run once for each image in the mini batch. So for a mini batch containing 18 images the forward defined above will run 18 times? That is not desired at all. What I want to do is run it once per mini batch but I need to pass in all 18 images as I’m using teacher forcing.
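A quick way to check this assumption is to count the calls. In the sketch below, `CountingModel` is a hypothetical toy module, not my LSTM; it only shows that one `model(batch)` call invokes `forward` once and hands it the whole batch tensor.

```python
import torch
import torch.nn as nn

class CountingModel(nn.Module):
    """Hypothetical toy module that counts how often forward is called."""

    def __init__(self):
        super().__init__()
        self.linear = nn.Linear(128, 128)
        self.calls = 0

    def forward(self, x):
        self.calls += 1  # incremented once per model(...) call
        return self.linear(x)

counting_model = CountingModel()
batch = torch.randn(18, 128)  # assumed: 18 rows standing in for 18 images
out = counting_model(batch)
calls_after_one_batch = counting_model.calls
```

So `forward` appears to run once per mini-batch, and any per-image or per-step loop has to be written explicitly inside it.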
What am I doing wrong? Is there a better way to build this architecture?
Please help, Thanks!