Why does data augmentation decrease the performance of efficientnet_v2_s on ILSVRC2012?

I am training the efficientnet_v2_s classification model on ILSVRC2012 with the official training code provided by PyTorch. When I apply the automatic augmentation strategy, classification accuracy is about 2% lower than when I train with no data augmentation at all. I also tried color jitter augmentation, and accuracy is still lower than with no augmentation. Why does data augmentation hurt performance in this setup?