Algorithm Suggestions

Hello,

I am trying to apply RL to a networking problem. The state space is continuous (dynamic network statistics) and the action space is discrete (one integer chosen from 0-9). Any input or suggestions for algorithms to apply? Most of the algorithms I'm seeing assume both spaces are continuous or both are discrete, not a combination of the two.

You can structure a model to give two outputs, one discrete and one continuous. Then apply the appropriate loss function to each output and add the two losses before backprop.

It might look something like:

...
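def forward(self, x):
    # Shared layers: both heads read the same features.
    x = self.layer1(x)
    x = self.layer2(x)
    ...
    y = x.clone()
    x = self.cont_layer(x)  # continuous head
    y = self.disc_layer(y)  # discrete head (e.g. class logits)
    return x, y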
...

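# cont_target and disc_target are placeholder names for each head's targets
loss1 = criterion1(x, cont_target)
loss2 = criterion2(y, disc_target)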

total_loss = loss1 + loss2
...

But keep in mind that some loss functions may need to be scaled by a constant before adding. For example, L1 or L2 loss may give you very large numbers compared to cross entropy. Multiplying one of the losses by a constant can help the model learn from both instead of optimizing only for the larger one.
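To make the scale mismatch concrete, here is a small sketch in plain Python (no framework, so the arithmetic is visible). The sample values, the helper names mse_loss and cross_entropy_loss, and the weight cont_weight are all made up for illustration; in practice the weight is something you would tune for your own data.

```python
import math

def mse_loss(pred, target):
    """Mean squared error (L2 loss) over two equal-length lists of floats."""
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)

def cross_entropy_loss(logits, target_index):
    """Cross entropy for one sample: -log(softmax(logits)[target_index])."""
    m = max(logits)  # subtract the max for numerical stability
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum_exp - logits[target_index]

# Hypothetical outputs of the two heads for a single sample.
cont_pred, cont_target = [120.0, 80.0], [100.0, 95.0]  # unnormalized continuous values
disc_logits, disc_target = [0.0] * 10, 3               # 10 classes, uniform logits

loss_cont = mse_loss(cont_pred, cont_target)               # 312.5
loss_disc = cross_entropy_loss(disc_logits, disc_target)   # log(10), about 2.30

# Unweighted, the continuous loss dominates the sum by two orders of magnitude.
unweighted = loss_cont + loss_disc

# Assumed weight, chosen here only to bring both terms to a similar scale.
cont_weight = 0.01
total_loss = cont_weight * loss_cont + loss_disc

print(f"continuous: {loss_cont:.3f}, discrete: {loss_disc:.3f}")
print(f"unweighted total: {unweighted:.3f}, weighted total: {total_loss:.3f}")
```

With these numbers the unweighted sum is almost entirely the L2 term, so its gradient would drown out the cross entropy term. After weighting, the two terms are of similar size. Normalizing the continuous targets beforehand is another way to get a similar effect.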