A couple of things that strike me as odd:
- You do `output = output.view(121, -1, 40)` and then push `output` through your GRU layer. But you defined your GRU layer with `batch_first=False`, which means the 121 will be interpreted as the sequence length. However, 121 seems to be your `hidden_size`.
- I’m not sure if using `view()` is the right way to get the shapes right; have a look at a post of mine.
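To illustrate both points, here is a minimal sketch. The sizes are made up for the demo and are not taken from your model. It fills each sample of a batch-first tensor with its own index, then compares `view()` against `transpose()` when moving to the `(seq_len, batch, input_size)` layout that a GRU with `batch_first=False` expects.

```python
import torch
import torch.nn as nn

# Hypothetical sizes, chosen only for illustration.
batch_size, seq_len, input_size, hidden_size = 2, 3, 4, 5

# Batch-first tensor where every element of sample b holds the value b.
x = torch.arange(batch_size, dtype=torch.float32).view(batch_size, 1, 1)
x = x.expand(batch_size, seq_len, input_size).contiguous()

# view() only reinterprets the underlying memory; it does not move axes.
viewed = x.view(seq_len, batch_size, input_size)
# transpose() actually swaps the batch and sequence axes.
swapped = x.transpose(0, 1).contiguous()

# After transpose, batch column b still contains only sample b.
transpose_keeps_samples = all(
    bool((swapped[:, b] == b).all()) for b in range(batch_size)
)
# After view, a batch column mixes time steps from different samples.
view_keeps_samples = all(
    bool((viewed[:, b] == b).all()) for b in range(batch_size)
)

# With batch_first=False (the default), nn.GRU reads its input as
# (seq_len, batch, input_size), so dim 0 is the sequence length.
gru = nn.GRU(input_size=input_size, hidden_size=hidden_size, batch_first=False)
out, h_n = gru(swapped)

print(transpose_keeps_samples, view_keeps_samples)  # True False
print(tuple(out.shape), tuple(h_n.shape))           # (3, 2, 5) (1, 2, 5)
```

Both tensors have the same shape, so the GRU accepts either one without complaint; only the transposed version keeps each sample's time steps together.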
In short, I’m pretty sure you mess up your inputs, so the network can’t learn anything meaningful.