I am training an OCR model with a ResNet backbone and a vanilla Transformer decoder on the Rimes dataset. I ran the following experiments and got these results:
Without augmentation: best validation loss = 111, best training loss = 2.77, test CER = 5.75
With augmentation: best validation loss = 71, best training loss = 95, test CER = 10.2
The loss is KL divergence with label smoothing in both cases.
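To make the loss concrete, here is a minimal pure-Python sketch of KL divergence against a label-smoothed target. It is not my actual training code: the function names, the smoothing value of 0.1, and the choice to sum over tokens by default are all assumptions for illustration. Whether the per-token values are summed or averaged changes the magnitude of the reported loss.

```python
import math

def smoothed_targets(target_index, vocab_size, smoothing=0.1):
    """Label-smoothed target distribution (hypothetical helper, vocab_size > 1)."""
    off_value = smoothing / (vocab_size - 1)   # mass spread over the wrong classes
    dist = [off_value] * vocab_size
    dist[target_index] = 1.0 - smoothing       # remaining mass on the true class
    return dist

def kl_divergence(target_dist, log_probs):
    """KL(target || model) for one token, given the model's log-probabilities."""
    return sum(
        t * (math.log(t) - lp)
        for t, lp in zip(target_dist, log_probs)
        if t > 0                               # terms with zero target mass contribute 0
    )

def sequence_loss(target_indices, log_probs_per_token, smoothing=0.1, average=False):
    """Sum (or average, if requested) the per-token KL over one sequence."""
    vocab_size = len(log_probs_per_token[0])
    per_token = [
        kl_divergence(smoothed_targets(idx, vocab_size, smoothing), log_probs)
        for idx, log_probs in zip(target_indices, log_probs_per_token)
    ]
    total = sum(per_token)
    return total / len(per_token) if average else total
```

With label smoothing the loss is zero only when the model reproduces the smoothed distribution exactly, and a summed loss grows with sequence length, so raw values are not directly comparable across setups that reduce differently.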
Augmentation function:
import albumentations
import albumentations.pytorch  # makes albumentations.pytorch.ToTensorV2 available

albumentations.Compose([
    # With probability 0.5, apply exactly one of the degradations below
    albumentations.OneOf(
        [
            albumentations.MotionBlur(p=1, blur_limit=5),
            albumentations.OpticalDistortion(p=1, distort_limit=0.05),
            albumentations.GaussNoise(p=1, var_limit=(10.0, 100.0)),
            albumentations.RandomBrightnessContrast(p=1, brightness_limit=0.2),
            albumentations.Downscale(p=1, scale_min=0.3, scale_max=0.5),
        ],
        p=.5,
    ),
    albumentations.Normalize(),
    albumentations.pytorch.ToTensorV2()
])
I am very confused about why the validation loss decreased while the CER increased. This happens only with the Rimes dataset; on other datasets such as IAM and Washington there is no such issue.
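For reference, CER is conventionally the character-level edit distance between prediction and ground truth, divided by the number of ground-truth characters. The sketch below shows that definition; it is not my exact evaluation script. The function names are illustrative, and reporting the result as a percentage is an assumption.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two strings (insert, delete, substitute)."""
    prev = list(range(len(hyp) + 1))           # distances from the empty prefix of ref
    for i, r in enumerate(ref, start=1):
        curr = [i]                             # deleting the first i characters of ref
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,                   # deletion
                curr[j - 1] + 1,               # insertion
                prev[j - 1] + cost,            # match or substitution
            ))
        prev = curr
    return prev[-1]

def cer(references, hypotheses):
    """Corpus-level character error rate, as a percentage."""
    total_edits = sum(edit_distance(r, h) for r, h in zip(references, hypotheses))
    total_chars = sum(len(r) for r in references)
    return 100.0 * total_edits / total_chars
```

Because CER is computed on the decoded strings while the loss is computed on the predicted token distributions, the two metrics measure different things and do not have to move together.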