CNN and Actor Critic


When using convolutional neural networks with reinforcement learning, is it best to have the critic and the policy share the convolutional layers, or to create two separate networks?

Thanks !

My thinking is that it's probably best to share the conv layers in this case.


In the Actor-Critic family of algorithms in general, there is no need to share network parameters: you could use two completely separate networks. That would cost you more memory and compute, and training would most likely take longer.

Sharing the parameters improves training time, but it makes training harder because the hyperparameters need more careful tuning. This is because the gradients flowing back from the actor loss and the critic loss have norms at completely different scales, which is hard to calibrate.
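To make the shared-trunk option concrete, here is a minimal PyTorch sketch, assuming an Atari-style 84x84 image input and a discrete action space. The class name, layer sizes, and the `value_coef` weight are all illustrative choices, not from the original post; `value_coef` is one common way to rebalance the differently scaled actor and critic gradients described above.

```python
import torch
import torch.nn as nn

class SharedActorCritic(nn.Module):
    """Illustrative actor-critic with a shared convolutional trunk."""

    def __init__(self, in_channels=4, n_actions=6):
        super().__init__()
        # Shared conv layers (Atari-style trunk, expects 84x84 inputs)
        self.trunk = nn.Sequential(
            nn.Conv2d(in_channels, 32, kernel_size=8, stride=4), nn.ReLU(),
            nn.Conv2d(32, 64, kernel_size=4, stride=2), nn.ReLU(),
            nn.Conv2d(64, 64, kernel_size=3, stride=1), nn.ReLU(),
            nn.Flatten(),
            nn.Linear(64 * 7 * 7, 512), nn.ReLU(),
        )
        # Separate heads: policy logits for the actor, scalar value for the critic
        self.policy_head = nn.Linear(512, n_actions)
        self.value_head = nn.Linear(512, 1)

    def forward(self, obs):
        features = self.trunk(obs)          # both heads reuse these features
        return self.policy_head(features), self.value_head(features)

def actor_critic_loss(log_probs, advantages, values, returns, value_coef=0.5):
    # value_coef down-weights the critic term so its gradients do not
    # dominate the shared trunk -- the calibration issue mentioned above.
    policy_loss = -(log_probs * advantages.detach()).mean()
    value_loss = (returns - values.squeeze(-1)).pow(2).mean()
    return policy_loss + value_coef * value_loss
```

With two fully separate networks you would instead instantiate two trunks and compute the two losses independently; the sketch above trades that memory and compute for the shared-gradient tuning burden discussed in this thread.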