Hi, I would like to know if there is a way to implement an “agent mask”. To clarify: imagine an environment that, at certain steps, provides observations to some agents but not to others. It would be logical that for an agent with no observation (so with agent_mask=False) the policy does not produce an action, and that the corresponding transitions are ignored when computing the loss…
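To make the idea concrete, here is a rough sketch of the behaviour I have in mind, written in plain NumPy rather than against any particular library's API. The function names (`select_actions`, `masked_mean_loss`), the `noop_action` placeholder, and the `(T, n_agents)` array layout are all assumptions made up for illustration.

```python
import numpy as np

def select_actions(policy_fn, observations, agent_mask, noop_action=-1):
    """Query the policy only for agents whose mask is True.

    Masked agents get `noop_action`, a hypothetical placeholder that the
    environment would be expected to ignore.
    """
    agent_mask = np.asarray(agent_mask, dtype=bool)
    actions = np.full(len(agent_mask), noop_action, dtype=int)
    for i in np.flatnonzero(agent_mask):
        actions[i] = policy_fn(observations[i])
    return actions

def masked_mean_loss(per_agent_loss, agent_mask):
    """Average a per-transition loss over unmasked entries only.

    per_agent_loss: float array, assumed shape (T, n_agents)
    agent_mask:     bool array of the same shape; False = no observation
    """
    per_agent_loss = np.asarray(per_agent_loss, dtype=float)
    agent_mask = np.asarray(agent_mask, dtype=bool)
    n_active = agent_mask.sum()
    if n_active == 0:
        # No valid transition in the batch: nothing to learn from.
        return 0.0
    # np.where (not multiplication) so NaN placeholders in masked slots
    # cannot leak into the sum.
    return float(np.where(agent_mask, per_agent_loss, 0.0).sum() / n_active)

# Two time steps, two agents; agent 1 has no observation at step 0.
loss = [[1.0, np.nan], [3.0, 5.0]]
mask = [[True, False], [True, True]]
print(masked_mean_loss(loss, mask))
```

Dividing by the number of active entries rather than the batch size keeps the loss scale independent of how many agents happen to be masked in a given batch.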