I have the following question. I am training a convolutional neural network on the CIFAR-10 dataset. Here is the structure of my fully connected layer:
- Linear layer
I have observed that when I print the loss (cross-entropy with L2 regularization) at each epoch, the loss is constant at 2.3036. However, when I remove the ReLU and softmax activations and replace (5) with a linear layer, this is not the case.
In either case my network does not learn, and I get very low accuracy on the training set.
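For what it's worth, I noticed that 2.3036 is very close to the cross-entropy of a uniform prediction over the 10 CIFAR-10 classes, which I suspect means the network is outputting (near-)uniform probabilities for every input:

```python
import math

# Cross-entropy of a uniform prediction over k classes is -log(1/k).
# For CIFAR-10, k = 10, giving roughly the 2.30 value I observe.
k = 10
uniform_ce = -math.log(1 / k)
print(round(uniform_ce, 4))  # 2.3026
```

The small difference from 2.3036 would then come from the L2 regularization term added to the loss.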