Do we need to use off-policy methods for policy shaping?

stefa91 · October 26, 2018, 9:08am

Let’s say that there is a human teacher who wants to manually modify the agent’s policy (policy shaping) to speed up its learning. Do I have to use off-policy methods, or can I get away with on-policy ones? Why?