Is it possible to achieve an accuracy of around 90% on CIFAR10 with just a feedforward net?

I was wondering whether it's possible, since all the examples I found online use convnets. I am trying to get a decent accuracy (say 90%) with just a feedforward net and no convolutions.

I know the convnet is the way to go, but I am still curious.

For example, even the PyTorch tutorial uses a convnet for MNIST, but I can get around 97% accuracy on it with a feedforward net and no convolutions.

So far, however, for CIFAR10 I'm only achieving 28 to 30% accuracy with the following setup (linear = linear combination plus bias); a runnable sketch follows the list.

  1. linear(32*32*3, 300), ReLU
  2. linear(300, 200), ReLU
  3. linear(200, 100), ReLU
  4. linear(100, 50), ReLU
  5. linear(50, 10), Softmax
  6. Cross Entropy Loss
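For reference, here is that setup as a minimal PyTorch sketch (the variable names are mine, just for illustration; note that nn.CrossEntropyLoss applies log-softmax internally, so the final Softmax layer is redundant with it and is left out here):

import torch.nn as nn

# The setup above as an nn.Sequential (a sketch)
model = nn.Sequential(
    nn.Flatten(),                 # 3x32x32 image -> 3072-dim vector
    nn.Linear(32 * 32 * 3, 300), nn.ReLU(),
    nn.Linear(300, 200), nn.ReLU(),
    nn.Linear(200, 100), nn.ReLU(),
    nn.Linear(100, 50), nn.ReLU(),
    nn.Linear(50, 10),            # raw logits; the loss adds log-softmax
)
criterion = nn.CrossEntropyLoss()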

Question: how many layers does the smallest CNN that achieves >90% accuracy on CIFAR10 have?

I ask because you are trying to get a similar result with only 5 layers. In addition to adding layers, you could try some of the best practices that help CNNs, for example…

You might find that adding batchnorm before the ReLUs helps, just as it often does with CNNs.
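A minimal sketch of what that could look like for the first hidden layer (sizes taken from the setup above):

import torch.nn as nn

# First hidden layer with batchnorm inserted before the activation (a sketch)
block = nn.Sequential(
    nn.Linear(32 * 32 * 3, 300),
    nn.BatchNorm1d(300),  # normalizes each of the 300 features over the batch
    nn.ReLU(),
)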

You could try introducing residual blocks. For example, between layers 1 and 2 you could add something like this:

# Feed layer1_output through
linear(300, 300), ReLU, linear(300, 300), ReLU -> intermediate_result
# Then add skip connection
layer2_input = intermediate_result + layer1_output
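As a runnable PyTorch sketch (the class and variable names are mine, just for illustration):

import torch
import torch.nn as nn

class ResidualBlock(nn.Module):
    # Two 300 -> 300 linear layers with a skip connection (a sketch)
    def __init__(self, width=300):
        super().__init__()
        self.fc1 = nn.Linear(width, width)
        self.fc2 = nn.Linear(width, width)

    def forward(self, layer1_output):
        x = torch.relu(self.fc1(layer1_output))
        intermediate_result = torch.relu(self.fc2(x))
        # add the skip connection
        return intermediate_result + layer1_output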

That is going to add a lot of parameters, so what about using some sort of bottleneck? E.g.:

# Reduce size
linear(300, 100)
# Feed through
linear(100, 100), ReLU, linear(100, 100), ReLU
# Increase size
linear(100, 300) -> intermediate_result
# Then add skip connection
layer2_input = intermediate_result + layer1_output

This has 80,000 weight parameters (not counting biases) instead of the 180,000 of the previous example, and who knows, it might work just as well.
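The same idea as a runnable sketch (again, the names are mine; the per-layer weight counts in the comments add up to the 80,000 above):

import torch
import torch.nn as nn

class BottleneckBlock(nn.Module):
    # 300 -> 100 -> 100 -> 300 with a skip connection (a sketch)
    def __init__(self, width=300, bottleneck=100):
        super().__init__()
        self.reduce = nn.Linear(width, bottleneck)    # 300*100 = 30,000 weights
        self.fc1 = nn.Linear(bottleneck, bottleneck)  # 100*100 = 10,000 weights
        self.fc2 = nn.Linear(bottleneck, bottleneck)  # 100*100 = 10,000 weights
        self.expand = nn.Linear(bottleneck, width)    # 100*300 = 30,000 weights

    def forward(self, layer1_output):
        x = self.reduce(layer1_output)                # reduce size
        x = torch.relu(self.fc1(x))                   # feed through
        x = torch.relu(self.fc2(x))
        intermediate_result = self.expand(x)          # increase size
        return intermediate_result + layer1_output    # skip connection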

I am sure there are other tricks used in CNNs that could be ported to a feedforward net.