I think this is different. I am trying to do something similar to what is presented in this paper.
Here the input is an image, and the states are also multichannel images. The input-to-hidden and hidden-to-hidden operations are convolutions instead of matrix-vector multiplications…
In your code, you just convert the output of a CNN to a vector and use a regular LSTM.
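To make the distinction concrete, here is a minimal sketch of a convolutional LSTM cell, assuming PyTorch. The class name `ConvLSTMCell`, its parameters, and the gate ordering are my own choices for illustration, not code from the paper or from the repository discussed here, and the sketch leaves out extras such as peephole connections that some formulations include. The point is only that both the input and the hidden state stay as (batch, channels, height, width) tensors and the gates come from a convolution rather than a matrix-vector product.

```python
import torch
import torch.nn as nn


class ConvLSTMCell(nn.Module):
    """Hypothetical ConvLSTM cell: the gates are computed by a convolution."""

    def __init__(self, in_channels, hidden_channels, kernel_size=3):
        super().__init__()
        self.hidden_channels = hidden_channels
        # A single convolution over [input, hidden] produces all four gates.
        self.gates = nn.Conv2d(
            in_channels + hidden_channels,
            4 * hidden_channels,
            kernel_size,
            padding=kernel_size // 2,  # keeps height/width unchanged for odd kernels
        )

    def forward(self, x, state):
        h, c = state  # each is (batch, hidden_channels, height, width)
        stacked = torch.cat([x, h], dim=1)  # concatenate along the channel axis
        i, f, o, g = torch.chunk(self.gates(stacked), 4, dim=1)
        c = torch.sigmoid(f) * c + torch.sigmoid(i) * torch.tanh(g)
        h = torch.sigmoid(o) * torch.tanh(c)
        return h, c


# Usage: one time step on a batch of two 3-channel 16x16 images.
cell = ConvLSTMCell(in_channels=3, hidden_channels=8)
x = torch.zeros(2, 3, 16, 16)
h0 = torch.zeros(2, 8, 16, 16)
c0 = torch.zeros(2, 8, 16, 16)
h1, c1 = cell(x, (h0, c0))
print(h1.shape)  # the state keeps its spatial layout: (2, 8, 16, 16)
```

A regular LSTM fed with a flattened CNN output would instead carry a (batch, hidden_size) vector as its state, so the spatial structure is lost after the first step.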
You have done great work. I am also interested in CLSTM and want to build something with it.
I don’t know how it runs on your machine, but I couldn’t run your code directly, so I rewrote some parts, and it runs well with these changes. I changed the loop in CLSTM.forward to:
    for idlayer in xrange(self.num_layers):
        for t in xrange(seq_len):
            ...
Do these changes conflict with your original intention?
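For anyone following along, the loop order described above (layers on the outside, time steps on the inside) can be sketched in plain Python as below. This is only an illustration under my own assumptions: `run_stacked` and the `cell(x, state) -> new_state` interface are hypothetical, the state doubles as the layer output to keep things short, and it uses `range` rather than `xrange` so it runs on Python 3. It is not the repository's actual `CLSTM.forward`.

```python
def run_stacked(cells, inputs, init_states):
    """Run a stack of recurrent cells over a sequence, one layer at a time.

    cells       : list of callables, cell(x, state) -> new_state (hypothetical)
    inputs      : list of per-time-step inputs for the first layer
    init_states : one initial state per layer
    """
    layer_input = inputs
    final_states = []
    for idlayer in range(len(cells)):        # outer loop: layers
        state = init_states[idlayer]
        outputs = []
        for t in range(len(layer_input)):    # inner loop: time steps
            state = cells[idlayer](layer_input[t], state)
            outputs.append(state)
        layer_input = outputs                # next layer consumes these outputs
        final_states.append(state)
    return layer_input, final_states


# Toy usage: each "cell" just adds its input to its state (a running sum).
def toy_cell(x, state):
    return x + state


outputs, finals = run_stacked([toy_cell, toy_cell], [1, 2, 3], [0, 0])
print(outputs, finals)
```

With this ordering, each layer sees the full output sequence of the layer below it before the next layer starts, which is why the time loop sits inside the layer loop.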
There was indeed an error in the input format (batch, seq_len,…). It happened because I used the right format in my own code but put the wrong one on GitHub. Could you please check again? Let me know if you still have any issues.