Hi,
I have a problem where the action space consists of two parts: one continuous value and one discrete choice among three options.
For the continuous action, I simply use the output of my network as the mean of a Gaussian distribution.
I would also like to use a Gaussian distribution for the policy of the discrete decision.
My question is: how would I transform the continuous output of my network into a discrete decision?
I thought about defining two boundary values: if the output of the network is lower than bound 1, take action 1; if it is between bound 1 and bound 2, take action 2; and if it is higher than bound 2, take action 3.
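To make the idea concrete, here is a minimal sketch of the thresholding I have in mind. The bound values, the standard deviation, and the function names are placeholders I made up for illustration; none of them are tuned or taken from a library.

```python
import random

# Hypothetical thresholds; placeholder values, not tuned.
BOUND_1 = -0.5
BOUND_2 = 0.5


def to_discrete_action(x, bound_1=BOUND_1, bound_2=BOUND_2):
    """Map a continuous value to one of three actions (1, 2 or 3)."""
    if x < bound_1:
        return 1  # below bound 1
    if x <= bound_2:
        return 2  # between bound 1 and bound 2
    return 3      # above bound 2


def sample_discrete_action(mean, std=1.0):
    """Sample from a Gaussian around the network output, then bin the sample."""
    x = random.gauss(mean, std)  # mean would be the network output
    return to_discrete_action(x)


# Example: the network output 0.1 is used as the Gaussian mean.
action = sample_discrete_action(0.1)
```

With this scheme the probability of each action is the Gaussian mass that falls into its interval, so it depends on both the mean and the assumed standard deviation.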
Is there a better way to do this?