I’m running the PPO-Lagrangian algorithm from the following repo: https://github.com/akjayant/PPO_Lagrangian_PyTorch (Implementation of PPO Lagrangian in PyTorch).
I changed the architecture of one of its models from:
```python
class MLPCritic(nn.Module):

    def __init__(self, obs_dim, hidden_sizes, activation):
        super().__init__()
        self.v_net = mlp([obs_dim] + list(hidden_sizes) + [1], activation)

    def forward(self, obs):
        return torch.squeeze(self.v_net(obs), -1)  # Critical to ensure v has right shape.
```
To:
```python
class MLPCritic_affine_Q_wide(nn.Module):

    def __init__(self, obs_dim, ac_dim, hidden_sizes, activation):
        super().__init__()
        self.q_w_net = mlp([obs_dim] + list(hidden_sizes) + [obs_dim], activation)
        self.fuse_layer = mlp([obs_dim + ac_dim] + [1], activation)

    def forward(self, obs, act):
        w = self.q_w_net(obs)
        result = self.fuse_layer(torch.cat((w, act), dim=-1))
        return result
```
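For reference, the parameter counts really are close. Here is a minimal sketch of how I checked, assuming a SpinningUp-style `mlp` helper (the repo is based on SpinningUp, but its exact helper may differ) and the default `(64, 64)` hidden sizes, with the two critic classes above in scope:

```python
import torch
import torch.nn as nn

# Assumed SpinningUp-style mlp helper; the repo's own helper may differ slightly.
def mlp(sizes, activation, output_activation=nn.Identity):
    layers = []
    for j in range(len(sizes) - 1):
        act = activation if j < len(sizes) - 2 else output_activation
        layers += [nn.Linear(sizes[j], sizes[j + 1]), act()]
    return nn.Sequential(*layers)

def n_params(model):
    # Total number of trainable parameters.
    return sum(p.numel() for p in model.parameters() if p.requires_grad)

obs_dim, act_dim, hidden_sizes = 2, 1, (64, 64)
v_critic = MLPCritic(obs_dim, hidden_sizes, nn.Tanh)
q_critic = MLPCritic_affine_Q_wide(obs_dim, act_dim, hidden_sizes, nn.Tanh)

# With (64, 64) hidden sizes this prints 4417 vs 4486 -- nearly identical.
print(n_params(v_critic), n_params(q_critic))
```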
During a simple test where:

obs_dim = 2
act_dim = 1

the time per backward update increased by roughly a factor of 10, yet there is no significant increase in parameter count or model depth.
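For completeness, here is roughly how I measured it: a minimal timing sketch continuing the setup above, using a stand-in squared-value loss rather than the repo's actual critic loss:

```python
import time

def time_backward(loss_fn, iters=100):
    # Warm-up once, then average wall-clock time of forward + backward.
    # Gradients accumulate across iterations, which is harmless for timing.
    loss_fn().backward()
    t0 = time.perf_counter()
    for _ in range(iters):
        loss_fn().backward()
    return (time.perf_counter() - t0) / iters

obs = torch.randn(4096, obs_dim)
act = torch.randn(4096, act_dim)

print("V critic:", time_backward(lambda: (v_critic(obs) ** 2).mean()))
print("Q critic:", time_backward(lambda: (q_critic(obs, act) ** 2).mean()))
```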
Why does such a minor change to the network have such a profound impact on the running time?