I have a use case where I think using two optimisers would help the model learn better. Let me explain: I have one model, Net(), which is trained over
N samples belonging to training_set1. I then use the results of this trained model (embeddings, to be precise) to generate another training set, training_set2, which is much
harder to learn than training_set1 (btw: I cannot generate training_set2 without first training on training_set1).
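For concreteness, here is a rough sketch of what that generation step could look like. The low-confidence filter below is only a stand-in for my actual embedding-based selection, and all names are placeholders:

```python
import torch

@torch.no_grad()
def build_training_set2(model, loader, threshold=0.6):
    # Placeholder criterion: treat samples the set1-trained model is
    # under-confident on as the "much harder" training_set2.
    model.eval()
    hard = []
    for x, y in loader:
        conf = model(x).softmax(dim=-1).max(dim=-1).values
        mask = conf < threshold
        if mask.any():
            hard.append((x[mask], y[mask]))
    model.train()
    return hard
```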
So, my plan is to use two optimisers like so:
```python
my_model = Net()
optimiser1 = Adam(my_model.parameters())  # rough syntax ;)
optimiser2 = Adam(my_model.parameters())  # rough syntax ;)

for epoch in range(num_epochs):
    # train my_model on training_set1 using optimiser1
    ...
    # generate training_set2, much harder samples,
    # and run a small number of epochs on these harder samples
    for training_set2_epoch in range(training_set2_epochs):
        # train my_model using optimiser2
        ...
```
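Filled in with a standard supervised loop (the loss, the loaders, and the generation step above are placeholders for my real setup), the plan looks like this:

```python
import torch
from torch import nn
from torch.optim import Adam

my_model = Net()
criterion = nn.CrossEntropyLoss()  # placeholder loss
optimiser1 = Adam(my_model.parameters(), lr=1e-3)
optimiser2 = Adam(my_model.parameters(), lr=1e-3)

for epoch in range(num_epochs):
    # phase 1: train on training_set1 with optimiser1
    for x, y in train_loader1:
        optimiser1.zero_grad()
        loss = criterion(my_model(x), y)
        loss.backward()
        optimiser1.step()

    # regenerate the harder training_set2 from the current model
    training_set2 = build_training_set2(my_model, train_loader1)

    # phase 2: a small number of epochs on the harder samples with optimiser2
    for _ in range(training_set2_epochs):
        for x, y in training_set2:
            optimiser2.zero_grad()
            loss = criterion(my_model(x), y)
            loss.backward()
            optimiser2.step()
```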
Is this a valid method to train a single model?
When I use two optimisers as above, my results are much better than when I train the model with just a single optimiser. Is this a fluke?