My single-fold model performed well on unseen test data with an 80%/20% train/validation split.
Here are more details about the single fold model.
Seed = 42, batch_size = 16, epochs = 15, StepLR (step_size = 5, factor = 0.1), TTA = 6,
image shape = 320x320, LR = 0.0005.
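To make the schedule concrete, here is a minimal sketch of the learning rate these settings produce. It assumes StepLR multiplies the learning rate by the factor once every step_size epochs (which is how PyTorch's `StepLR` behaves with its `gamma` argument); the helper `step_lr` is a hypothetical name used only for illustration.

```python
def step_lr(base_lr, step_size, factor, epoch):
    """Learning rate at a 0-indexed epoch under step decay (assumed StepLR behaviour)."""
    return base_lr * factor ** (epoch // step_size)


# Settings from the post: LR = 0.0005, step_size = 5, factor = 0.1, 15 epochs.
schedule = [step_lr(0.0005, 5, 0.1, epoch) for epoch in range(15)]

for epoch, lr in enumerate(schedule):
    print(f"epoch {epoch:2d}: lr = {lr:.1e}")
```

Under that assumption the learning rate is 5e-4 for epochs 0-4, 5e-5 for epochs 5-9, and 5e-6 for epochs 10-14, so the last five epochs train with a very small learning rate.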
Now I want to implement StratifiedKFold with 5 folds (n_splits = 5). I applied the same settings as above to every fold, and the result performed worse on the unseen test data than the single-fold model.
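For context, this is roughly the split I mean, sketched with scikit-learn's `StratifiedKFold`. The labels below are made-up placeholders (100 samples, 3 imbalanced classes), not my real data, and the training loop is left out.

```python
import numpy as np
from sklearn.model_selection import StratifiedKFold

# Hypothetical labels: 100 samples in 3 imbalanced classes (not the real dataset).
labels = np.array([0] * 60 + [1] * 30 + [2] * 10)
features = np.zeros((len(labels), 1))  # placeholder; only the labels drive the split

skf = StratifiedKFold(n_splits=5, shuffle=True, random_state=42)

fold_sizes = []
val_class_counts = []
for fold, (train_idx, val_idx) in enumerate(skf.split(features, labels)):
    counts = np.bincount(labels[val_idx], minlength=3).tolist()
    fold_sizes.append((len(train_idx), len(val_idx)))
    val_class_counts.append(counts)
    print(f"fold {fold}: train={len(train_idx)} val={len(val_idx)} val class counts={counts}")
    # A fresh model, optimizer, and scheduler would be created and trained here.
```

With 5 folds, each fold still trains on 80% of the data and validates on 20%, the same ratio as the single split, while keeping the class proportions the same in every validation fold.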
I think the problem is with the number of epochs, the StepLR step_size, and TTA.
My questions are:
- Should I use TTA for each fold or not? (I have heard that averaging two weaker models can be better than averaging the two best models, or something like that.)
- What should my epoch range be? Or should I stick with epochs = 15?
- What should the step_size for StepLR be?
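To clarify what "TTA for each fold" would mean, here is a small sketch of one possible way to combine the predictions: average the TTA views within each fold, then average the folds. The probabilities are invented for a single test image, the helper `mean_probs` is a hypothetical name, and I use 3 folds with 2 TTA views to keep it short (my setup would be 5 folds with 6 views).

```python
def mean_probs(prob_lists):
    """Element-wise mean of equally sized probability vectors."""
    n = len(prob_lists)
    return [sum(column) / n for column in zip(*prob_lists)]


# Hypothetical softmax outputs for ONE test image, indexed as [fold][tta_view][class].
fold_tta_probs = [
    [[0.7, 0.3], [0.6, 0.4]],  # fold 0
    [[0.4, 0.6], [0.5, 0.5]],  # fold 1
    [[0.8, 0.2], [0.9, 0.1]],  # fold 2
]

# Step 1: average the TTA views inside each fold.
per_fold = [mean_probs(views) for views in fold_tta_probs]

# Step 2: average the per-fold predictions to get the ensemble prediction.
ensemble = mean_probs(per_fold)
predicted_class = max(range(len(ensemble)), key=ensemble.__getitem__)

print("per-fold:", per_fold)
print("ensemble:", ensemble, "-> class", predicted_class)
```

In this toy example fold 1 on its own would predict class 1, but the averaged ensemble predicts class 0.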