Image Generation with LSTM

I want to use an LSTM to generate images. I have input images of shape (30, 2, 32, 32) and target images of shape (30, 1, 32, 32). How should I structure my data so that I can generate images with the following architecture? Or is it even possible to get reasonable results with such a simple architecture?


To generate images I would use a GAN rather than an LSTM.
Take a look at these GANs for image generation.

The images I’m generating form a sequence, which is why I’m using an LSTM, but I will have a look at the GAN repo, thanks.

The GAN framework can be used with almost any kind of neural network. Take a look at these papers, for example:


First, it works for sure.
Second, even though I’m unsure of this process, I think you can view the image-generating LSTM as the decoder LSTM from Seq2Seq in NLP: https://pytorch.org/tutorials/intermediate/seq2seq_translation_tutorial.html#sphx-glr-intermediate-seq2seq-translation-tutorial-py Your current approach might work as well; you can test the results of the different approaches.
Third, you don’t have to initialize the hidden states to zeros manually; PyTorch’s LSTM does that automatically.
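For reference, here is a minimal sketch (layer sizes are arbitrary, just for illustration) showing that `nn.LSTM` can be called without an explicit hidden state; PyTorch then initializes `(h_0, c_0)` to zeros:

```python
import torch
import torch.nn as nn

# Minimal sketch with assumed sizes: an nn.LSTM used without passing (h_0, c_0).
# When the hidden state is omitted, PyTorch initializes both h_0 and c_0 to zeros.
lstm = nn.LSTM(input_size=32, hidden_size=128)

seq = torch.randn(10, 4, 32)     # (sequence_length, batch_size, input_size)
out, (h_n, c_n) = lstm(seq)      # no initial state passed; defaults to zeros
print(out.shape)                 # torch.Size([10, 4, 128])
```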


Thank you very much for your help.

What do you mean by “it works for sure”? Do you mean for my specific problem, or that it works in general?

Generating an image step by step with an LSTM can be framed as a sequence-generation problem, and an LSTM is a common choice for generating sequences. A rough sketch of what that stepwise generation could look like is below.
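This is only one hedged way to set it up, treating each image row as a timestep; the sizes and the linear readout layer are assumptions for illustration, not the original poster's architecture:

```python
import torch
import torch.nn as nn

# Sketch: generate a 32x32 single-channel image one row at a time with an LSTMCell.
row_size, hidden_size, height, batch = 32, 128, 32, 4

cell = nn.LSTMCell(input_size=row_size, hidden_size=hidden_size)
to_row = nn.Linear(hidden_size, row_size)   # maps hidden state back to a row of pixels

h = torch.zeros(batch, hidden_size)
c = torch.zeros(batch, hidden_size)
row = torch.zeros(batch, row_size)          # start "token": an all-zero row

rows = []
for _ in range(height):                     # one timestep per image row
    h, c = cell(row, (h, c))
    row = torch.sigmoid(to_row(h))          # predicted next row, values in [0, 1]
    rows.append(row)

image = torch.stack(rows, dim=1).view(batch, 1, height, row_size)
print(image.shape)                          # torch.Size([4, 1, 32, 32])
```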

How would you configure your data if you had an input dataset with dimensions (batch_size=256, channels=2, height=64, width=64) to match the LSTM input format (sequence_length, batch_size, input_size)?

You would first have to convert each image into a sequence; for example, flatten the image in some way (or extract patches with Unfold). Then reshape the tensor into a shape that matches the LSTM input, as in the sketch below.
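Here is a rough sketch of two possible serializations; both are assumptions about how to turn the image into a sequence, not the only options:

```python
import torch
import torch.nn as nn

x = torch.randn(256, 2, 64, 64)   # (batch, channels, height, width)

# Option 1: treat each image row as one timestep; input_size = channels * width.
seq = x.permute(2, 0, 1, 3).reshape(64, 256, 2 * 64)   # (64, 256, 128)

# Option 2: extract 8x8 non-overlapping patches with nn.Unfold
# and treat each patch as one timestep.
unfold = nn.Unfold(kernel_size=8, stride=8)
patches = unfold(x)               # (256, 2*8*8, 64) = (batch, patch_size, num_patches)
seq2 = patches.permute(2, 0, 1)   # (64, 256, 128) = (seq_len, batch, input_size)

lstm = nn.LSTM(input_size=128, hidden_size=256)
out, _ = lstm(seq)                # works for either layout
print(seq.shape, seq2.shape, out.shape)
```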