I have implemented ShallowNet for saliency prediction, and the code seems to work fine. The authors of the ShallowNet paper say the model needs to be trained for 1000 epochs. When I started training, my initial validation loss was 0.024888, but after the first epoch it dropped to 0.009461, and in subsequent epochs it decreases only marginally.
My questions are:
- Is it advisable to continue training for the full 1000 epochs?
- Does the model still learn even though the validation error seems to have saturated?
- The authors implemented the model in Caffe, and I am trying to rewrite it in PyTorch. Is it possible that the change of framework affects the hyperparameters?
- In the paper, the authors change the momentum term from 0.9 to 0.999 over the 1000 epochs. As far as I know, PyTorch has no scheduler for momentum like the ones for the learning rate. How can this be done?
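On the last point, here is one way it could be done: although PyTorch's built-in momentum scheduling only comes with cyclic policies such as `torch.optim.lr_scheduler.CyclicLR` (via its `cycle_momentum` option), the `momentum` value of an `SGD` optimizer can be updated in place each epoch through `optimizer.param_groups`. The sketch below assumes a simple linear interpolation from 0.9 to 0.999 over 1000 epochs; the paper may use a different interpolation, and `momentum_at` is a hypothetical helper, not part of PyTorch.

```python
import torch

def momentum_at(epoch, total_epochs=1000, start=0.9, end=0.999):
    # Hypothetical linear schedule: interpolate momentum from `start`
    # to `end` over `total_epochs` (the exact curve used in the paper
    # is an assumption here).
    t = min(epoch / total_epochs, 1.0)
    return start + t * (end - start)

model = torch.nn.Linear(4, 1)  # stand-in for the actual ShallowNet
optimizer = torch.optim.SGD(model.parameters(), lr=0.01, momentum=0.9)

for epoch in range(1000):
    # ... run the training step for this epoch here ...
    # There is no built-in momentum scheduler for plain SGD, but the
    # value can be overwritten directly in the optimizer's param_groups.
    for group in optimizer.param_groups:
        group["momentum"] = momentum_at(epoch + 1)
```

After the loop, `optimizer.param_groups[0]["momentum"]` has reached 0.999, mirroring what a learning-rate scheduler's `step()` call does for `lr`.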
Thank you in advance for your valuable time and for helping me learn.