How to make an algorithm learn some actions more than others in a multi-action env

Hi, I have a multi-action env and I want my algorithm (PPO) to learn some of my actions first. To do so, I thought of detaching the unwanted actions for some epochs and then going back to normal training. I implemented this with an if in the forward method (I pass a parameter that indicates whether those actions should be detached). Do you think this implementation is right?
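To make the idea concrete, here is a minimal sketch of what I mean. It is not my actual code: it assumes PyTorch and discrete action heads on a shared trunk, the names `MultiHeadPolicy` and `frozen` are made up for illustration, and the loss is only a stand-in for the real PPO objective.

```python
import torch
import torch.nn as nn


class MultiHeadPolicy(nn.Module):
    """Hypothetical policy: shared trunk, one linear head per action."""

    def __init__(self, obs_dim=8, hidden=32, n_actions=4, n_choices=3):
        super().__init__()
        self.trunk = nn.Sequential(nn.Linear(obs_dim, hidden), nn.Tanh())
        self.heads = nn.ModuleList(
            [nn.Linear(hidden, n_choices) for _ in range(n_actions)]
        )

    def forward(self, obs, frozen=()):
        # `frozen` (assumed parameter) lists head indices to cut from the graph.
        features = self.trunk(obs)
        logits = []
        for i, head in enumerate(self.heads):
            out = head(features)
            if i in frozen:
                # Detaching blocks gradients to this head and to the trunk via it.
                out = out.detach()
            logits.append(out)
        return logits


torch.manual_seed(0)
policy = MultiHeadPolicy()
obs = torch.randn(5, 8)

# Early epochs: learn actions 1 and 2 (indices 0, 1), detach actions 3 and 4.
logits = policy(obs, frozen=(2, 3))
loss = sum(l.logsumexp(dim=-1).mean() for l in logits)  # placeholder loss
loss.backward()

# Heads that were detached never receive a gradient.
grad_is_none = [head.weight.grad is None for head in policy.heads]
print(grad_is_none)
```

After the warm-up epochs I would call `policy(obs)` with the default empty `frozen`, so that all four heads receive gradients again.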
For example, we have 4 actions and I want to learn actions 1 and 2 more, like below:
model