Generally speaking, when we have two separate networks for the actor and the critic, say actor_net and critic_net (with no shared modules or layers), which of the two approaches described below is better for performing the optimization, in terms of best practices and standards?
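For concreteness, here is a minimal sketch of the kind of setup both snippets below assume (the input, hidden, and output sizes are placeholders, not my actual architecture):

import torch.nn as nn

# two fully independent networks; 8, 64, and 2 are placeholder sizes
actor_net = nn.Sequential(nn.Linear(8, 64), nn.Tanh(), nn.Linear(64, 2))
critic_net = nn.Sequential(nn.Linear(8, 64), nn.Tanh(), nn.Linear(64, 1))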
1. Separate:
import torch.nn.functional as F
from torch.optim import Adam

class ActorCritic:
    def __init__(self, *args):
        ...
        # one optimizer per network, each with its own learning rate
        self.actor_optim = Adam(self.actor_net.parameters(), lr=self.lr_actor)
        self.critic_optim = Adam(self.critic_net.parameters(), lr=self.lr_critic)
        ...

    def learn(self, *args):
        for step in range(total_steps):
            ...  # collect rollout
            for update in range(num_updates_per_step):
                ...  # forward pass
                actor_loss = ...
                critic_loss = F.mse_loss(..., ...)
                # backprop for actor network
                self.actor_optim.zero_grad()
                actor_loss.backward(retain_graph=True)
                self.actor_optim.step()
                # backprop for critic network
                self.critic_optim.zero_grad()
                critic_loss.backward()
                self.critic_optim.step()
            ...
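A note on the retain_graph=True in option 1: since both losses are computed from the same forward pass, the first backward() would otherwise free intermediate buffers that the second backward() still needs. A toy illustration of the effect, unrelated to my actual code:

import torch

x = torch.randn(4, 3, requires_grad=True)
h = torch.tanh(x)        # shared intermediate; tanh saves its output for backward
loss_a = h.sum()
loss_b = (h * h).sum()

loss_a.backward(retain_graph=True)  # keep the saved buffers alive
loss_b.backward()  # raises a RuntimeError if retain_graph=True is omitted above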
or
2. Integrated:
import torch.nn.functional as F
from torch import optim

class ActorCritic:
    def __init__(self, *args):
        ...
        # one optimizer with a parameter group per network,
        # so each network keeps its own learning rate
        self.optim = optim.Adam([
            {'params': self.actor_net.parameters(), 'lr': self.lr_actor},
            {'params': self.critic_net.parameters(), 'lr': self.lr_critic}
        ])
        ...

    def learn(self, *args):
        for step in range(total_steps):
            ...  # collect rollout
            for update in range(num_updates_per_step):
                ...  # forward pass
                actor_loss = ...
                critic_loss = F.mse_loss(..., ...)
                # single backward pass on the combined loss
                total_loss = actor_loss + critic_loss
                self.optim.zero_grad()
                total_loss.backward()
                self.optim.step()
            ...
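For when I run the experiment, this is roughly the minimal comparison I have in mind (toy linear networks, a synthetic batch, and placeholder learning rates; it assumes the two losses touch disjoint parameters, i.e. anything the actor loss takes from the critic is detached):

import copy
import torch
import torch.nn as nn
import torch.nn.functional as F
from torch.optim import Adam

torch.manual_seed(0)
actor_a, critic_a = nn.Linear(8, 2), nn.Linear(8, 1)
actor_b, critic_b = copy.deepcopy(actor_a), copy.deepcopy(critic_a)  # identical initial weights

obs = torch.randn(16, 8)
target = torch.randn(16, 1)

# variant 1: separate optimizers, separate backward calls
actor_optim = Adam(actor_a.parameters(), lr=3e-4)
critic_optim = Adam(critic_a.parameters(), lr=1e-3)
actor_loss = -actor_a(obs).mean()                 # stand-in for a policy loss
critic_loss = F.mse_loss(critic_a(obs), target)
actor_optim.zero_grad()
actor_loss.backward()
actor_optim.step()
critic_optim.zero_grad()
critic_loss.backward()
critic_optim.step()

# variant 2: one optimizer with two parameter groups, one combined backward
optim_ab = Adam([
    {'params': actor_b.parameters(), 'lr': 3e-4},
    {'params': critic_b.parameters(), 'lr': 1e-3},
])
total_loss = -actor_b(obs).mean() + F.mse_loss(critic_b(obs), target)
optim_ab.zero_grad()
total_loss.backward()
optim_ab.step()

# with disjoint parameter sets I would expect identical updates
for p, q in zip(actor_a.parameters(), actor_b.parameters()):
    assert torch.allclose(p, q)
for p, q in zip(critic_a.parameters(), critic_b.parameters()):
    assert torch.allclose(p, q)
print('updates match')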
First of all, are both logically correct? If so, does it make a difference which one is used, and if it does, which is preferred?
I haven't experimented with this yet, but I plan to; I wanted to see what the community's opinion is first.