Huge input size in RNN

Hi everyone.
My data size is [32,10,40,300] ([batch size, context length, utterance length, embedding dimension]).
However, the input size of RNN is [batch size, sequence length, embedding dimension].
Thus, I reshape my data to [32,10,40 * 300] ([batch size, context length, utterance length * embedding dimension]) to learn the context representation.
Does feeding such a huge input vector (40 * 300 = 12,000 dimensions per time step) into the RNN have any negative impact?
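For reference, the reshape-then-RNN approach above can be sketched like this (a minimal sketch using a GRU with a hypothetical hidden size of 128; the tensor sizes are the ones from the question):

```python
import torch
import torch.nn as nn

# Hypothetical random data with the sizes from the question:
# [batch size, context length, utterance length, embedding dim]
x = torch.randn(32, 10, 40, 300)

# Flatten each utterance into one 12,000-dim vector per time step.
x_flat = x.view(32, 10, 40 * 300)

# batch_first=True so the RNN input is [batch, seq, feature].
rnn = nn.GRU(input_size=40 * 300, hidden_size=128, batch_first=True)
out, h = rnn(x_flat)
print(out.shape)  # torch.Size([32, 10, 128])
```

Note that the input-to-hidden weight matrix alone is 12,000 x (3 * 128) here, so most of the model's parameters end up in that first projection.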

Another way I thought of is to use a CNN + max pooling to extract utterance-level features first, so the input becomes [32,10,200] ([batch size, context length, feature dimension]) before it goes into the RNN.
Is the second way better than the first?
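The CNN + max-pool variant could look something like this (again a sketch, not your exact model: the 200 filters of width 3 and the GRU hidden size of 128 are assumptions, matching the 200-dim feature size mentioned above):

```python
import torch
import torch.nn as nn

# [batch size, context length, utterance length, embedding dim]
x = torch.randn(32, 10, 40, 300)

# Merge batch and context dims so each utterance is encoded independently;
# Conv1d expects [N, channels, length] = [320, 300, 40].
u = x.view(32 * 10, 40, 300).transpose(1, 2)

# 200 filters of width 3, then max-over-time pooling -> one 200-dim
# feature vector per utterance.
conv = nn.Conv1d(in_channels=300, out_channels=200, kernel_size=3, padding=1)
feat = torch.max(torch.relu(conv(u)), dim=2).values  # [320, 200]
feat = feat.view(32, 10, 200)                        # [batch, context, feature]

rnn = nn.GRU(input_size=200, hidden_size=128, batch_first=True)
out, h = rnn(feat)
print(out.shape)  # torch.Size([32, 10, 128])
```

Here the RNN only sees a 200-dim input per step instead of 12,000, and the CNN weights are shared across all utterances, which keeps the parameter count much smaller than the flat reshape.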