Hi,
I’m fairly new to PyTorch and I’m trying to build an 18-node LSTM out of LSTMCell modules, trained with teacher forcing. I’m running into several difficulties.
Here’s my model:
class tryLSTM(nn.ModuleList):
    def __init__(self, input_size, hidden_size, batch_size):
        super(tryLSTM, self).__init__()
        self.input_size = input_size
        self.hidden_size = hidden_size
        self.batch_size = batch_size
        self.lstm0 = nn.LSTMCell(input_size, hidden_size, bias=True)
        self.lstm1 = nn.LSTMCell(input_size, hidden_size, bias=True)
        self.lstm2 = nn.LSTMCell(input_size, hidden_size, bias=True)
        .........
        self.lstm17 = nn.LSTMCell(input_size, hidden_size, bias=True)

    def init_hidden(self):
        # initialize the hidden state and the cell state to zeros
        hidden = torch.zeros(self.batch_size, self.hidden_size)
        cell = torch.zeros(self.batch_size, self.hidden_size)
        return hidden, cell

    def forward(self, x, hc):
        out = []
        h_0, c_0 = hc
        h_1, c_1 = self.lstm1(x[0], h_0, c_0)
        out[0] = h_1
        h_2, c_2 = self.lstm2(x[1], h_1, c_1)
        out[1] = h_2
        ......
        h_17, c_17 = self.lstm17(x[16], h_16, c_16)
        out[16] = h_17

model = tryLSTM(input_size=128, hidden_size=128, batch_size=18)
if gpu: model.cuda()
optimizer = optim.Adam(model.parameters(), lr=0.0001)
criterion = nn.BCELoss(weight=None, reduction='mean')
Here’s the training loop:
def train(epoch):
    model.train()
    # initialize hidden and cell state
    hc = model.init_hidden()
    for batch_idx, (data, target) in enumerate(train_loader):
        # Zero out the gradients
        optimizer.zero_grad()
        target = data[1:]
        print(target.size())
        # Put data on GPU
        if gpu:
            data = data.cuda()
            target = target.cuda()
        # Get outputs of LSTM
        output = model(data, hc)
        print(output.size())
        # Calculate loss
        loss = criterion(output, target)
        # Calculate gradients
        loss.backward()
        # Update model parameters
        optimizer.step()
        train_loss.append(loss.item())

Q.1 I’m getting the following error:
TypeError: forward() takes from 2 to 3 positional arguments but 4 were given
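For context, here is a minimal sketch of how `nn.LSTMCell` is called according to its documented signature, `cell(input, (h, c))`: the hidden and cell states go in as one tuple, not as two separate positional arguments. The sizes below are made up for illustration and are not the ones used in the model above.

```python
import torch
import torch.nn as nn

# Illustrative sizes only (assumed, not taken from the post).
batch_size, input_size, hidden_size = 4, 8, 16

cell = nn.LSTMCell(input_size, hidden_size)
x_t = torch.randn(batch_size, input_size)
h = torch.zeros(batch_size, hidden_size)
c = torch.zeros(batch_size, hidden_size)

# States are passed as a single (h, c) tuple.
h_next, c_next = cell(x_t, (h, c))

# Passing h and c as separate arguments raises a TypeError,
# because forward() then receives one positional argument too many.
try:
    cell(x_t, h, c)
    raised = False
except TypeError:
    raised = True
```

If this is what is happening here, the calls in `forward` of the form `self.lstm1(x[0], h_0, c_0)` would be the likely source of the message.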
Q.2 I’m not sure whether this is the correct way to build what I want.
My mini-batch X has shape (18, 3, 128, 128), i.e. 18 images. What I want to achieve is as follows:
The 1st cell’s input is x[0], and its output h_1 should be similar to x[1].
The 2nd cell’s input is x[1] together with h_1, and its output h_2 should be similar to x[2],
and so on.
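The chain described above could be sketched roughly as follows. This is only an illustration under several assumptions that are not in the original code: each image is flattened to a feature vector (a small stand-in size of 32 is used here), the hidden size equals that feature size so h_t can be compared with x[t+1], the 18 frames are treated as one sequence with a batch of one, and `nn.MSELoss` is swapped in as the loss. The class name `ChainedCells` is hypothetical.

```python
import torch
import torch.nn as nn

class ChainedCells(nn.Module):  # hypothetical name, for illustration only
    def __init__(self, feat_size, num_steps):
        super().__init__()
        # One LSTMCell per step, mirroring lstm0, lstm1, ... in the post.
        self.cells = nn.ModuleList(
            [nn.LSTMCell(feat_size, feat_size) for _ in range(num_steps)]
        )
        self.feat_size = feat_size

    def forward(self, x):
        # x: (seq_len, feat_size); the sequence is treated as a batch of one.
        h = x.new_zeros(1, self.feat_size)
        c = x.new_zeros(1, self.feat_size)
        outputs = []
        for t, cell in enumerate(self.cells):
            # Teacher forcing: step t always sees the true frame x[t],
            # never the model's own previous output.
            h, c = cell(x[t].unsqueeze(0), (h, c))
            outputs.append(h)
        return torch.cat(outputs, dim=0)  # (num_steps, feat_size)

seq_len, feat_size = 18, 32          # 32 is an assumed flattened size
x = torch.rand(seq_len, feat_size)   # stand-in for 18 flattened images
model = ChainedCells(feat_size, num_steps=seq_len - 1)

pred = model(x)                      # h_1 ... h_17
target = x[1:]                       # x[1] ... x[17]
loss = nn.MSELoss()(pred, target)    # MSELoss is an assumption, not the post's BCELoss
loss.backward()
```

Keeping the cells in an `nn.ModuleList` attribute and looping over them avoids writing out 17 nearly identical lines, while still giving each step its own weights as in the post.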
I believe the forward pass is run once for each image in the mini-batch, so for a mini-batch containing 18 images the forward defined above would run 18 times? That is not what I want. I want to run it once per mini-batch, but I need to pass in all 18 images because I’m using teacher forcing.
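One way to check this belief is to count how often `forward` is invoked. The sketch below uses a hypothetical `CountingModel` with made-up sizes; it assumes nothing beyond standard `nn.Module` call behaviour, where `model(batch)` dispatches to `forward` with whatever tensor it is given.

```python
import torch
import torch.nn as nn

class CountingModel(nn.Module):  # hypothetical, only counts forward calls
    def __init__(self):
        super().__init__()
        self.calls = 0
        self.linear = nn.Linear(8, 8)

    def forward(self, x):
        self.calls += 1
        return self.linear(x)

model = CountingModel()
batch = torch.randn(18, 8)   # 18 samples in one mini-batch
out = model(batch)
print(model.calls)           # prints 1: forward ran once for the whole batch
```

In this check, `forward` receives the full tensor of 18 samples in a single call, so indexing such as `x[0]` inside `forward` selects one sample from that tensor.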
What am I doing wrong? Is there a better way to build this architecture?
Please help, Thanks!