Actor-Critic with Discrete Actions

codeflux · December 19, 2021, 4:22pm

Hi,

I have a problem where the action space consists of two parts: one continuous value and one selection out of three options.

For the continuous action, I simply use the output of my network as the mean of a Gaussian distribution.
I also want to use a Gaussian distribution for the policy of my discrete decision.
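
To make the continuous part concrete, here is a minimal sketch of what I mean. It assumes a fixed standard deviation `sigma` (a placeholder value, not something I have tuned) and uses a plain float `mean` to stand in for the network output; the function name is just for illustration.

```python
import math
import random

def sample_gaussian_action(mean, sigma=0.5, rng=random):
    """Sample an action from N(mean, sigma^2) and return it with its log-probability."""
    # `mean` stands in for the network output; `sigma` is an assumed fixed value.
    action = rng.gauss(mean, sigma)
    # Log-density of the Gaussian at the sampled action (used in the policy gradient).
    log_prob = (-0.5 * ((action - mean) / sigma) ** 2
                - math.log(sigma * math.sqrt(2 * math.pi)))
    return action, log_prob
```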

My question is: how would I transform the continuous output of my network into a discrete decision?

I thought about calculating two boundary values: if the output of the network is lower than bound 1, take action 1; if it is between bound 1 and bound 2, take action 2; and if it is higher than bound 2, take action 3.
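
A rough sketch of that thresholding idea, in case it helps. The boundary values `bound_1` and `bound_2`, the standard deviation `sigma`, and the function names are all placeholders I made up for illustration, and treating a value exactly on a boundary as action 2 is an arbitrary choice.

```python
import random

def discretize(value, bound_1=-0.5, bound_2=0.5):
    """Map a continuous value to one of three actions (1, 2 or 3)."""
    # Boundary values are assumed placeholders; ties go to action 2.
    if value < bound_1:
        return 1
    if value <= bound_2:
        return 2
    return 3

def sample_discrete_action(mean, sigma=1.0, rng=random):
    """Sample from N(mean, sigma^2), then threshold the sample into an action."""
    # `mean` stands in for the network output for the discrete decision.
    raw = rng.gauss(mean, sigma)
    return raw, discretize(raw)
```

With this mapping, the probability of each action is the Gaussian mass that falls in its interval, so moving the mean shifts probability between neighbouring actions.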

Is there a better way to do this?