I have 3D data with the following shape:
batch_size, num_channels, num_time_series, h, w = data.size()
I want to pass this data through an LSTM layer, but it only takes the batch size, the number of time steps, and the input size. I could flatten the data into a 1D vector per time step, but I don’t know how to handle the number of channels. Does it make sense to flatten h, w, and num_channels together?
You don’t need to flatten your data. An LSTM can consume 3D input directly.
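Here x is the 3D input, of dimension
batch_size x sequence_length x embedding_dim
(this layout requires batch_first=True in nn.LSTM; the default layout is sequence_length x batch_size x embedding_dim).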
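A minimal sketch of feeding such a 3D tensor to nn.LSTM. The sizes and hidden_size below are assumed for illustration and are not taken from this thread:

```python
import torch
import torch.nn as nn

# Illustrative sizes (assumptions, not from the thread)
batch_size, seq_len, embedding_dim = 4, 10, 32
hidden_size = 16

# batch_first=True makes the LSTM expect (batch, seq, feature) input
lstm = nn.LSTM(input_size=embedding_dim, hidden_size=hidden_size, batch_first=True)

x = torch.randn(batch_size, seq_len, embedding_dim)
out, (h_n, c_n) = lstm(x)

print(out.shape)  # torch.Size([4, 10, 16]) -> one output per time step
print(h_n.shape)  # torch.Size([1, 4, 16])  -> final hidden state per layer
```

Note that h_n and c_n keep the (num_layers, batch, hidden) layout regardless of batch_first.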
Thanks for your reply.
Actually, I have a Conv3d layer before this LSTM layer, and the conv’s number of output channels is not equal to one. Assume there are n channels, each holding 3D data (one dimension is the sequence, and the other two are the rows and columns). The LSTM input must be 3-dimensional: the first dimension is the batch size, the second is the sequence dimension, and the third is the input size. The batch size and sequence length should match those of the original data, so it makes sense to flatten the data along the row and column dimensions. What I don’t understand is how to handle the channel dimension. Should I loop over the channels and assign a separate LSTM layer to each one, or should I treat the LSTM input as input.size() = batch_size, seq_len, num_channels * w * h?
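For illustration, the second option (folding the channels into the feature dimension) could be sketched as below. This is only a sketch under assumptions: the tensor sizes, hidden_size, and the Conv3d kernel and padding settings are made up for the example, and it does not claim this option is better than one LSTM per channel.

```python
import torch
import torch.nn as nn

# Illustrative sizes (assumptions, not from the thread)
batch_size, in_channels, seq_len, h, w = 2, 1, 5, 8, 8
n_channels = 3      # Conv3d output channels
hidden_size = 16

# kernel_size=3 with padding=1 keeps seq_len, h and w unchanged
conv = nn.Conv3d(in_channels, n_channels, kernel_size=3, padding=1)

data = torch.randn(batch_size, in_channels, seq_len, h, w)
feat = conv(data)                       # (batch, n_channels, seq_len, h, w)

# Move the time axis next to the batch axis, then flatten
# (channels, h, w) into a single feature vector per time step.
seq = feat.permute(0, 2, 1, 3, 4)       # (batch, seq_len, n_channels, h, w)
seq = seq.reshape(batch_size, seq_len, n_channels * h * w)

lstm = nn.LSTM(input_size=n_channels * h * w,
               hidden_size=hidden_size, batch_first=True)
out, _ = lstm(seq)

print(seq.shape)  # torch.Size([2, 5, 192])
print(out.shape)  # torch.Size([2, 5, 16])
```

The permute before the reshape matters: reshaping the Conv3d output directly would mix values from different time steps into the same feature vector.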