Sequential MNIST

How can I make a transformation that results in a sequential MNIST dataset? I cannot seem to figure this out. .view(-1, 784) seems like it would work, but it does not seem to be correct, and I get the error:
RuntimeError: size mismatch, m1: [16 x 784], m2: [28 x 400]. If I do .view(-1, 784).t(), the program doesn’t raise the error, but the network doesn’t learn anything. This is with batch_first set to True.


Hi @conner

To feed an RNN model with batch_first, you would need to feed tensors of size batch * sequence * dimension, even if the dimension is one.
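To make the shapes concrete, here is a minimal sketch of feeding pixel-by-pixel sequences to an LSTM with batch_first=True. The hidden size of 100 is an assumption: the "m2: [28 x 400]" in the error would be consistent with an LSTM built with input_size=28 and hidden_size=100 (the input weight of an LSTM has 4 * hidden_size rows), but the original model is not shown. Random tensors stand in for real MNIST images.

```python
import torch
import torch.nn as nn

batch_size, seq_len, input_size = 16, 28 * 28, 1
hidden_size = 100  # assumed value, not taken from the original code

# input_size=1 because each time step is a single pixel
lstm = nn.LSTM(input_size=input_size, hidden_size=hidden_size, batch_first=True)

images = torch.rand(batch_size, 1, 28, 28)   # stand-in for a batch of MNIST images
sequence = images.view(batch_size, -1, 1)    # batch * sequence * dimension = (16, 784, 1)

output, (h_n, c_n) = lstm(sequence)
print(output.shape)  # torch.Size([16, 784, 100])
print(h_n.shape)     # torch.Size([1, 16, 100])
```

If the model is meant to read one image row per step instead, the input would be (16, 28, 28) with input_size=28; either way the last dimension of the input has to match input_size.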

You could do this on the dataloader level (where you don’t have the batch dimension, so the output tensor is sequence * dimension) with a transform like

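transform = torchvision.transforms.Compose([
    torchvision.transforms.ToTensor(),  # PIL image -> float tensor of shape (1, 28, 28)
    torchvision.transforms.Lambda(lambda x: x.view(-1, 1)),  # flatten to (784, 1): sequence * dimension
])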

Then you can pass that transform to the dataset instantiation and wrap the dataset in the usual DataLoader.

If you want permuted sequential MNIST, you could take

pixel_permutation = torch.randperm(28*28)  # drawn once, reused for every image
transform = torchvision.transforms.Compose([
    torchvision.transforms.ToTensor(),
    torchvision.transforms.Lambda(lambda x: x.view(-1, 1)[pixel_permutation]),
])

(The permutation should be fixed: draw it once and apply the same one to every image, in both the training and the test set.)
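One way to keep the permutation fixed across runs is to draw it from a seeded generator. This is only a sketch: the seed value and the helper name permute_pixels are made up for illustration.

```python
import torch

# Seeded generator so the same permutation is reproduced on every run (seed chosen arbitrarily)
generator = torch.Generator().manual_seed(0)
pixel_permutation = torch.randperm(28 * 28, generator=generator)

def permute_pixels(x):
    """Flatten an image tensor to (784, 1) and reorder its pixels with the fixed permutation."""
    return x.view(-1, 1)[pixel_permutation]

image = torch.rand(1, 28, 28)      # stand-in for one MNIST image after ToTensor
permuted = permute_pixels(image)
print(permuted.shape)              # torch.Size([784, 1])
```

The permuted sequence contains exactly the same pixel values as the original image, only in a different order, and applying the function twice to the same image gives the same result.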

If you are looking at implementing this

you’d instantly have a fan-club if you post a link to your implementation. :wink:

Best regards




Following on from what Thomas (@tom) said, there seem to be two classes of AI “researchers”:

  1. Those that take a simple vanilla model and then i) scale it up to more parameters/layers, ii) get it to run faster on the GPU and distribute/parallelize it, and iii) run it on massive datasets.

  2. The complement of class 1

If you’re working on sequential MNIST, and looking to do smart things like EUNN or DNC, then you’re in the second class :smile:, and in the long run I’m sure you’ll learn a heck of a lot more :smile:

Best of luck with your research!


Thank you for your responses, @tom and @AjayTalati! Currently, I am starting by building an LSTM (and later a GRU) baseline for sequential and permuted MNIST. The current problem is that the loss refuses to go down and the accuracy always stays fixed at 11.35%. My code can be found here: If anyone would be willing to take a look, I would really appreciate it.
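One thing worth checking with a number like this: 1,135 of the 10,000 MNIST test images are the digit 1, so 11.35% is exactly what a model scores if it predicts 1 for every input. The sketch below is not the code linked above; it is a minimal, assumed baseline (the class name, hidden size, learning rate, and clipping value are all made up) that runs one training step on random data and prints how the predictions are distributed over the classes.

```python
import torch
import torch.nn as nn

class SequentialMNISTClassifier(nn.Module):  # hypothetical name
    def __init__(self, hidden_size=128, num_classes=10):
        super().__init__()
        self.lstm = nn.LSTM(input_size=1, hidden_size=hidden_size, batch_first=True)
        self.head = nn.Linear(hidden_size, num_classes)

    def forward(self, x):
        # x: (batch, 784, 1); classify from the final hidden state of the last layer
        _, (h_n, _) = self.lstm(x)
        return self.head(h_n[-1])

model = SequentialMNISTClassifier(hidden_size=32)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)  # assumed settings
criterion = nn.CrossEntropyLoss()

pixels = torch.rand(4, 784, 1)          # stand-in batch
labels = torch.randint(0, 10, (4,))

logits = model(pixels)
loss = criterion(logits, labels)
optimizer.zero_grad()
loss.backward()
# Gradient clipping is commonly used with long sequences (784 steps here)
torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm=1.0)
optimizer.step()

# If nearly all predictions fall into one class, the model has collapsed to a constant output
predicted = logits.argmax(dim=1)
print(torch.bincount(predicted, minlength=10))
```

Running the same bincount over the whole test set would show whether the stuck accuracy comes from a constant prediction or from something else, such as mismatched input shapes.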




Hi, I am new to PyTorch. Suppose I have a sequence of images, each containing a number: 1 in the first image, 2 in the next, 3 in the third, 4 in the fourth, and so on. How can I get an RNN model to predict the next image in the sequence (an image of the number 5) using PyTorch?

I am also facing the same problem. Have you figured out why?

No, not yet. But I am going with the values rather than the images, because my images have many pixels, which makes training take a lot of time.
Have you solved the problem?
If so, could you share it in this discussion?

I do some hacks to get this working, see and an older example using a different submodule commit:

The high-level idea is that you want to implement a sampler that filters your labels.
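The linked examples are not reproduced here, so the following is only a guess at what such a sampler could look like, not the implementation referred to above. LabelFilterSampler is a hypothetical name; it keeps the indices of samples whose label is in an allowed set and hands them to a regular DataLoader. A small synthetic dataset stands in for MNIST.

```python
import torch
from torch.utils.data import DataLoader, Sampler, TensorDataset

class LabelFilterSampler(Sampler):  # hypothetical helper, not part of PyTorch
    """Yields only the indices of samples whose label is in allowed_labels."""

    def __init__(self, targets, allowed_labels, shuffle=True):
        targets = torch.as_tensor(targets)
        allowed = torch.as_tensor(list(allowed_labels))
        # mask[i] is True when targets[i] equals any allowed label
        mask = (targets.unsqueeze(1) == allowed.unsqueeze(0)).any(dim=1)
        self.indices = mask.nonzero(as_tuple=True)[0]
        self.shuffle = shuffle

    def __iter__(self):
        if self.shuffle:
            order = self.indices[torch.randperm(len(self.indices))]
        else:
            order = self.indices
        return iter(order.tolist())

    def __len__(self):
        return len(self.indices)

# Synthetic stand-in: 100 samples with labels 0..9, ten of each
targets = torch.arange(100) % 10
dataset = TensorDataset(torch.rand(100, 784, 1), targets)

sampler = LabelFilterSampler(targets, allowed_labels=[0, 1])
loader = DataLoader(dataset, batch_size=8, sampler=sampler)

seen = torch.cat([labels for _, labels in loader])
print(len(sampler))                  # 20
print(sorted(set(seen.tolist())))    # [0, 1]
```

With torchvision's MNIST dataset, the labels for the first argument would come from the dataset's targets attribute.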