Can I backprop with an output tensor detached or attached depending on a boolean variable?

Uk_Jo · November 8, 2018, 11:01am

Hi all,

I am building a multi-agent reinforcement learning system using DDPG, where each agent has a parameterized action space. When a particular action is chosen, the low-level parameters of the other actions become useless. In that case I don’t want the policy network to be trained on them, so I thought that if the output value is detached, it won’t be trained. But DDPG trains on sampled batches, so some samples in a batch would need to be detached and others not. Is it possible to customize the backward pass myself?
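To make the question concrete, here is a minimal sketch of one way per-sample detaching might look, assuming a boolean mask with one entry per batch sample is available. The helper name `mask_gradient`, the tensor shapes, and the mask values are hypothetical and not taken from any real DDPG code; `torch.where` and `Tensor.detach` are standard PyTorch calls.

```python
import torch


def mask_gradient(output: torch.Tensor, train_mask: torch.Tensor) -> torch.Tensor:
    """Keep the values of `output`, but let gradients flow only where `train_mask` is True.

    Hypothetical helper: where the mask is False, the detached copy is selected,
    so those entries contribute nothing to the backward pass.
    """
    return torch.where(train_mask, output, output.detach())


torch.manual_seed(0)

# Stand-in for the action parameters produced by the policy network
# (assumed shape: batch of 5 samples, 3 action parameters each).
raw = torch.randn(5, 3, requires_grad=True)

# Assumed per-sample flag: True means "these parameters were actually used".
# Shape (5, 1) so that it broadcasts across the parameter dimension.
train_mask = torch.tensor([True, False, True, False, True]).unsqueeze(1)

masked = mask_gradient(raw, train_mask)

# Any loss would do here; the sum makes the resulting gradient easy to inspect.
masked.sum().backward()

# Rows 0, 2 and 4 receive a gradient of ones; rows 1 and 3 receive zeros.
print(raw.grad)
```

The forward values are unchanged, so the critic would still see the same actions; only the gradient path is cut for the masked samples. If more control over the backward pass is needed, `Tensor.register_hook` may be another option, since it lets you modify a tensor's gradient when it is computed.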