Are you guys aware of any techniques for combining learnt policies obtained from different learning algorithms?
I.e. imagine you have a domain and you apply 2 different learning algorithms to it (algo A & algo B). Algo A finds some policy trajectories while algo B tends to find others (though the two sets are not mutually exclusive). Is there a way of combining them to try to get the best of both worlds?
If possible, please attach a paper to your reply.