Combine LSTM with transpose CNN

Is it possible to combine an LSTM with a transpose CNN? I have time series data and want to use a combination of an LSTM and a transpose CNN for image recognition from that data. Can you please let me know whether this can be done in PyTorch?

This depends on the specifics of the problem. Here is a quick example showing one way you could go about it, assuming a time series with a single value per timestep, i.e. an input of shape [B,W]. If each timestep has multiple features and the input has shape [B,H,W] or [B,W,H], it changes a little but not much. That said, I'm not 100% sure how well this would work, as my focus has been on transformers rather than LSTMs; it might take some tweaking to get working well. One thing to note is that in the snippet below, the one that starts from torch.ones((1, 32)), I preserve the sequence length by using a stride of (2,1), so only the height is upsampled. This can of course be changed, and the sequence can be downsampled via Conv2d if that is your goal. I also expand the unsqueezed height dimension to 2: with kernel_size=3, padding=1 and a stride of 2, a height of 1 stays at 1, so you need more than one feature to expand in that direction (this will be less hacky if you have multiple datapoints per timestep).

One note on increasing channels: this is doable, but you'd have to take the route MMDenseLSTM uses, i.e. compress back to a single channel via a 1x1 convolution and use that as the LSTM's input. After that you could concatenate the LSTM output with the input and reduce the channel count back to the original via a 1x1 Conv2d.
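For what it's worth, here is a minimal sketch of that channel route, separate from the single-channel snippet further down. The class name ChannelLSTMBlock, the channel count of 8 and the 3x3 widening convolution are made up for illustration. I'm also assuming that "concatenate with the input" means the multi-channel feature map and that "the original" channel count is 1, so treat it as a starting point rather than how MMDenseLSTM actually does it.

```python
import torch
import torch.nn as nn


class ChannelLSTMBlock(nn.Module):
    """Hypothetical block: widen channels, run an LSTM on a 1-channel summary, merge back."""

    def __init__(self, channels: int = 8, height: int = 5):
        super().__init__()
        # 1 -> C channels (assumed 3x3 conv, keeps H and W)
        self.widen = nn.Conv2d(1, channels, kernel_size=3, padding=1)
        # C -> 1 channel via a 1x1 conv, so the LSTM sees one feature map
        self.squeeze = nn.Conv2d(channels, 1, kernel_size=1)
        # The LSTM runs along W and treats the H values per timestep as features
        self.lstm = nn.LSTM(height, height, batch_first=True)
        # (C + 1) -> 1 channel via a 1x1 conv after concatenation
        self.merge = nn.Conv2d(channels + 1, 1, kernel_size=1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        feats = self.widen(x)                     # B,C,H,W
        seq = self.squeeze(feats).squeeze(1)      # B,H,W
        seq, _ = self.lstm(seq.transpose(1, 2))   # B,W,H
        seq = seq.transpose(1, 2).unsqueeze(1)    # B,1,H,W
        merged = torch.cat([feats, seq], dim=1)   # B,C+1,H,W
        return self.merge(merged)                 # B,1,H,W


block = ChannelLSTMBlock(channels=8, height=5)
y = block(torch.ones(2, 1, 5, 32))
print(y.shape)
```

The LSTM here is unidirectional so the height stays the same; with bidirectional=True the LSTM output would have 2*H features and the merge step would need adjusting.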

import torch
import torch.nn as nn

a = torch.ones((1, 32))          # B,W
a = a.unsqueeze(1).unsqueeze(1)  # B,1,1,W
# stride (2,1): upsample the height only, keep the sequence length W
conv = nn.ConvTranspose2d(1, 1, kernel_size=3, padding=1, stride=(2,1))
b = conv(a.expand((a.shape[0], a.shape[1], 2, a.shape[3]))) # B,1,2,W -> B,1,3,W
b2 = conv(b) # B,1,5,W

dims = b2.shape[2]
lstm = nn.LSTM(dims, dims, batch_first=True, bidirectional=True)

out, _ = lstm(b2.squeeze(1).transpose(1,2)) # B,W,10 because bidirectional (B,W,5 otherwise)
out = out.transpose(1,2).unsqueeze(1) # B,1,10,W (B,1,5,W if not bidirectional)
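For the multi-feature case mentioned above, where the input is [B,H,W], the expand trick is no longer needed because the height is already greater than 1. A rough sketch, with made-up sizes (B=4, H=6, W=32) and random data standing in for a real time series:

```python
import torch
import torch.nn as nn

B, H, W = 4, 6, 32        # batch, features per timestep, sequence length (illustrative)
x = torch.randn(B, H, W)  # if the data is B,W,H, call x.transpose(1, 2) first

# Upsample only the feature axis; stride 1 along W keeps the sequence length.
up = nn.ConvTranspose2d(1, 1, kernel_size=3, padding=1, stride=(2, 1))
feats = up(x.unsqueeze(1))  # B,1,H,W -> B,1,2*H-1,W

dims = feats.shape[2]
lstm = nn.LSTM(dims, dims, batch_first=True, bidirectional=True)

out, _ = lstm(feats.squeeze(1).transpose(1, 2))  # B,W,2*dims
out = out.transpose(1, 2).unsqueeze(1)           # B,1,2*dims,W
print(feats.shape, out.shape)
```

With these settings each transposed convolution takes the height from H to 2*H-1, and the bidirectional LSTM doubles the feature count again on the way out.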