I am trying to understand whether the validation loss should decrease monotonically, or whether it can have the shape I am seeing in this case. I ask because the validation accuracy does increase steadily, as expected.
ResNet-56 trained on CIFAR-10.
Optimizer = SGD
LR = 0.1, reduced to 0.01 at epoch 100 and to 0.001 at epoch 150.
Weight decay = 1e-4
Momentum = 0.9
Data augmentation according to the paper.
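
For reference, here is a minimal sketch of the learning-rate schedule and optimizer settings listed above, in plain Python so it does not depend on any framework (the post does not say which one was used). The names `lr_at_epoch`, `BASE_LR`, `MILESTONES`, `GAMMA` and `SGD_CONFIG` are hypothetical, and it assumes epochs are counted from 0 and that each drop takes effect at the start of the milestone epoch.

```python
# Hypothetical sketch of the step schedule described in the post.
# Assumptions: epochs are 0-indexed and each drop applies from the
# milestone epoch onward (epoch 100 already trains with LR = 0.01).

BASE_LR = 0.1            # initial learning rate
MILESTONES = (100, 150)  # epochs at which the learning rate is reduced
GAMMA = 0.1              # multiplicative factor applied at each milestone

# Remaining SGD hyperparameters from the post, kept here for completeness.
SGD_CONFIG = {"momentum": 0.9, "weight_decay": 1e-4}


def lr_at_epoch(epoch: int) -> float:
    """Return the learning rate used during the given 0-indexed epoch."""
    # Count how many milestones have been reached so far.
    drops = sum(1 for milestone in MILESTONES if epoch >= milestone)
    return BASE_LR * GAMMA ** drops


if __name__ == "__main__":
    # Print the learning rate just before and after each drop.
    for epoch in (0, 99, 100, 149, 150):
        print(f"epoch {epoch:3d}: lr = {lr_at_epoch(epoch):.4f}")
```

If the training happens to use PyTorch, the same schedule could be expressed with `torch.optim.SGD` and `torch.optim.lr_scheduler.MultiStepLR(optimizer, milestones=[100, 150], gamma=0.1)`. When reading the loss curve, it helps to mark epochs 100 and 150, since changes in its shape often line up with the learning-rate drops.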