I am working on action recognition. Each of my frames has shape [3, 50]. I feed the frames into an MLP, and the MLP's output is then fed to a GRU. Should the GRU's input_size (that is, the MLP's output size) be 3 x 50 = 150?
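To make the setup concrete, here is a minimal sketch of the pipeline described above, assuming PyTorch. The class name `FrameMLPGRU`, the layer widths, the `mlp_out` and `hidden` sizes, and the batch and clip lengths are all hypothetical values chosen for illustration, not my actual model. In this sketch the only hard constraint is that the GRU's `input_size` matches the last dimension of the MLP's output; the flattened frame size (150) only fixes the MLP's input.

```python
import torch
import torch.nn as nn


class FrameMLPGRU(nn.Module):
    """Per-frame MLP followed by a GRU over time (illustrative sketch only)."""

    def __init__(self, frame_shape=(3, 50), mlp_out=64, hidden=128):
        super().__init__()
        # Each [3, 50] frame is flattened to a vector of 3 * 50 = 150 values.
        in_features = frame_shape[0] * frame_shape[1]
        # Hypothetical MLP; the intermediate width 128 is an arbitrary choice.
        self.mlp = nn.Sequential(
            nn.Linear(in_features, 128),
            nn.ReLU(),
            nn.Linear(128, mlp_out),
        )
        # input_size must equal the MLP's output size, not necessarily 150.
        self.gru = nn.GRU(input_size=mlp_out, hidden_size=hidden, batch_first=True)

    def forward(self, x):
        # x: (batch, time, 3, 50)
        b, t = x.shape[:2]
        x = x.reshape(b, t, -1)      # (batch, time, 150)
        feats = self.mlp(x)          # nn.Linear acts on the last dim: (batch, time, mlp_out)
        out, h_n = self.gru(feats)   # out: (batch, time, hidden)
        return out, h_n


model = FrameMLPGRU()
clip = torch.randn(8, 16, 3, 50)     # 8 clips of 16 frames each (made-up sizes)
out, h_n = model(clip)
print(out.shape, h_n.shape)
```

Setting `mlp_out=150` would also run, so 150 is one valid choice for `input_size` but not a required one, as far as this sketch goes.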
Thank you in advance.