Combining learnt policies (from different learning algorithms)

Are you guys aware of any techniques for combining learnt policies obtained from different learning algorithms?

I.e., imagine you have a domain and you apply 2 different learning algorithms (algo A & algo B). Algo A finds some policy trajectories, while algo B tends to find others (though the two sets are not mutually exclusive). Is there a way of combining them to try to get the best of both worlds?

If possible, please attach a paper to your reply.