Overfitting in ResNet34

Hi Everyone,

I am training ResNet34 from scratch for three-class classification on 23,823 patches. I have tried the Adam optimizer with different learning rates and a reduce-LR-on-plateau scheduler, but while the training loss keeps decreasing, the validation loss decreases for a while and then increases. For data augmentation I use random rotation and horizontal flip; since I work with grayscale images, I cannot use more augmentation operations. I have also tried weight decay and the SGD optimizer, but the ResNet34 model still overfits.

I have also tried batch sizes of 4, 16, and 32.
I have attached the training and validation loss curves. Please suggest where I might be going wrong.
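
In case it helps to put numbers on the curves, here is a minimal sketch of how the turning point of the validation loss could be located from per-epoch logs. The function name `find_overfitting_onset` is hypothetical and the loss values are made up for illustration; they are not my actual results.

```python
def find_overfitting_onset(train_losses, val_losses):
    """Return (best_epoch, final_gap) from per-epoch loss logs.

    best_epoch is the 0-based epoch with the lowest validation loss;
    final_gap is validation loss minus training loss at the last epoch.
    """
    if not val_losses or len(train_losses) != len(val_losses):
        raise ValueError("loss lists must be non-empty and of equal length")
    # Epoch index at which the validation loss is smallest.
    best_epoch = min(range(len(val_losses)), key=val_losses.__getitem__)
    # A large positive gap at the end is the usual sign of overfitting.
    final_gap = val_losses[-1] - train_losses[-1]
    return best_epoch, final_gap


# Made-up illustrative values, NOT measurements from my runs.
train = [1.10, 0.85, 0.60, 0.40, 0.25, 0.12]
val = [1.05, 0.90, 0.78, 0.80, 0.95, 1.20]

best_epoch, final_gap = find_overfitting_onset(train, val)
print(f"validation loss is lowest at epoch {best_epoch}")
print(f"final val/train gap: {final_gap:.2f}")
```

On logs like these, the model weights from `best_epoch` would be the ones worth keeping, and the gap shows how far the two curves have drifted apart by the end of training.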

I am facing the same problem with the ResNet50 model, using a batch size of 32 and similar augmentations.