[solved] CNN+LSTM structured RNN

Hi Folks,

I am trying to implement this paper: “https://arxiv.org/pdf/1701.01909.pdf”. To do that, I need to train a sequence-to-one network with the following structure:

I need to train both the FC layer (i.e., \phi_{t}^A) and the LSTM. I am new to PyTorch and am not sure how to build this network with it. I would appreciate any suggestions.

I think my understanding was wrong. I do not need to train the FC layer (i.e., \phi_{t}^A) together with the LSTM. So, no problem now.

Could you please share how you handle the batch and timestep dimensions in your training process? Specifically, what does your model's forward function look like? Did you loop over the frames one at a time, or did you merge the batch and timestep dimensions into one, extract the spatial features with the CNN, split the batch and timestep dimensions apart again, and then feed the result into the LSTM? Thanks in advance.
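
For anyone landing here with the same question, here is a minimal sketch of the second option (fold the timesteps into the batch, run the CNN once, unfold, then run the LSTM). It is not the original poster's code and not the architecture from the paper: the class name CNNLSTM, the layer sizes, and the input shape (batch, time, channels, height, width) are all assumptions chosen for illustration.

```python
import torch
import torch.nn as nn


class CNNLSTM(nn.Module):
    """Hypothetical sequence-to-one model: per-frame CNN features -> LSTM -> one output."""

    def __init__(self, in_channels=3, feat_dim=64, hidden_dim=128, num_outputs=10):
        super().__init__()
        # Small illustrative CNN; swap in any per-frame feature extractor.
        self.cnn = nn.Sequential(
            nn.Conv2d(in_channels, 16, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.MaxPool2d(2),
            nn.Conv2d(16, 32, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),  # (N, 32, 1, 1) regardless of input size
            nn.Flatten(),             # (N, 32)
            nn.Linear(32, feat_dim),
            nn.ReLU(),
        )
        # batch_first=True so the LSTM expects (batch, time, features).
        self.lstm = nn.LSTM(feat_dim, hidden_dim, batch_first=True)
        self.head = nn.Linear(hidden_dim, num_outputs)

    def forward(self, x):
        # x: (batch, time, channels, height, width)
        b, t, c, h, w = x.shape
        # Fold time into the batch dimension so the CNN sees ordinary 4D input.
        feats = self.cnn(x.reshape(b * t, c, h, w))  # (b * t, feat_dim)
        # Unfold back into a sequence of per-frame feature vectors.
        feats = feats.reshape(b, t, -1)              # (b, t, feat_dim)
        out, _ = self.lstm(feats)                    # (b, t, hidden_dim)
        # Sequence-to-one: use only the LSTM output at the last timestep.
        return self.head(out[:, -1])                 # (b, num_outputs)


model = CNNLSTM()
clip = torch.randn(4, 8, 3, 32, 32)  # 4 clips of 8 RGB frames, 32x32 each
logits = model(clip)                 # one prediction per clip
```

Looping over frames and stacking the per-frame features should give the same result up to floating-point error, as long as the CNN has no layers whose behaviour depends on batch composition (for example BatchNorm in training mode). The reshape version is usually faster because the CNN runs once on a larger batch. If the CNN is meant to stay fixed, as in the case described above where the FC layer is not trained with the LSTM, one option is to compute feats under torch.no_grad() or to set requires_grad to False on the CNN parameters.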