Hi,
Our project uses meta-learning to assist DDPG training. Both meta.py and ddpg.py define an Actor class and a Critic class.
The Actor class in ddpg.py and the one in meta.py have identical contents, just in different files; the same is true of the two Critic classes.
When the DDPG training and the meta training are run independently, everything executes normally.
But when both are run at the same time, the following error occurs.
This is the error message I get after enabling torch.autograd.set_detect_anomaly(True):
[W ..\torch\csrc\autograd\python_anomaly_mode.cpp:104] Warning: Error detected in MmBackward0. Traceback of forward call that caused the error:
File "C:\Users\$STG000-RQUJF6OTH79G\Desktop\meta2-pytorch\meta_v9\main2.py", line 843, in <module>
train(on_group = ON_GROUP) #train(scenario_list=train_set)
File "C:\Users\$STG000-RQUJF6OTH79G\Desktop\meta2-pytorch\meta_v9\main2.py", line 222, in train
meta_action = meta_rl.choose_action(meta_state, meta_state_mcs)
File "C:\Users\$STG000-RQUJF6OTH79G\Desktop\meta2-pytorch\meta_v9\META_ori_local.py", line 251, in choose_action
action = self.actor(self.state_concatenate)#.cpu().numpy() # Forward pass through the actor network
File "C:\Users\$STG000-RQUJF6OTH79G\.conda\envs\pytorch\lib\site-packages\torch\nn\modules\module.py", line 1102, in _call_impl
return forward_call(*input, **kwargs)
File "C:\Users\$STG000-RQUJF6OTH79G\Desktop\meta2-pytorch\meta_v9\META_ori_local.py", line 78, in forward
actions = torch.tanh(self.fc3(x_clone)) # TensorFlow: a = tf.keras.layers.Dense(self.a_dim, activation=tf.nn.tanh, name='a', trainable=trainable)(net2)
File "C:\Users\$STG000-RQUJF6OTH79G\.conda\envs\pytorch\lib\site-packages\torch\nn\modules\module.py", line 1102, in _call_impl
return forward_call(*input, **kwargs)
File "C:\Users\$STG000-RQUJF6OTH79G\.conda\envs\pytorch\lib\site-packages\torch\nn\modules\linear.py", line 103, in forward
return F.linear(input, self.weight, self.bias)
File "C:\Users\$STG000-RQUJF6OTH79G\.conda\envs\pytorch\lib\site-packages\torch\nn\functional.py", line 1848, in linear
return torch._C._nn.linear(input, weight, bias)
(function _print_stack)
Traceback (most recent call last):
File "C:\Users\$STG000-RQUJF6OTH79G\Desktop\meta2-pytorch\meta_v9\main2.py", line 843, in <module>
train(on_group = ON_GROUP) #train(scenario_list=train_set)
File "C:\Users\$STG000-RQUJF6OTH79G\Desktop\meta2-pytorch\meta_v9\main2.py", line 258, in train
rl.learn(t_slot)
File "C:\Users\$STG000-RQUJF6OTH79G\Desktop\meta2-pytorch\meta_v9\Attention_DDPG_ori_local.py", line 301, in learn
critic_loss.backward()
File "C:\Users\$STG000-RQUJF6OTH79G\.conda\envs\pytorch\lib\site-packages\torch\_tensor.py", line 307, in backward
torch.autograd.backward(self, gradient, retain_graph, create_graph, inputs=inputs)
File "C:\Users\$STG000-RQUJF6OTH79G\.conda\envs\pytorch\lib\site-packages\torch\autograd\__init__.py", line 154, in backward
Variable._execution_engine.run_backward(
RuntimeError: one of the variables needed for gradient computation has been modified by an inplace operation: [torch.cuda.FloatTensor [64, 3]], which is output 0 of AsStridedBackward0, is at version 13; expected version 1 instead. Hint: the backtrace further above shows the operation that failed to compute its gradient. The variable in question was changed in there or anywhere later. Good luck!
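From what I have read, this class of error means that a tensor saved for the backward pass (for example a layer weight) was modified in place, such as by an optimizer.step(), after the forward pass built the graph but before backward() ran. Below is a minimal sketch of my understanding; this is not my project code, the module and optimizer are made up purely for illustration:

import torch
import torch.nn as nn

# Minimal illustration: an in-place parameter update between forward and
# backward invalidates a tensor that the autograd graph saved.
net = nn.Linear(3, 3)
opt = torch.optim.SGD(net.parameters(), lr=0.1)

x = torch.randn(4, 3, requires_grad=True)
y = net(x)                        # the graph saves net.weight to backprop into x
loss1 = y.sum()
loss1.backward(retain_graph=True)

opt.step()                        # in-place update bumps net.weight's version counter

loss2 = (2 * y).sum()
loss2.backward()                  # RuntimeError: ... modified by an inplace operation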
In my own code, I'm not sure where such an in-place modification happens, but I suspect the actor is causing the error.
Below is my Actor class:
import torch
import torch.nn as nn

class Actor(nn.Module):
    def __init__(self, state_dim, action_dim, action_bound, dropout_rate):
        super(Actor, self).__init__()
        self.device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
        # Actor network architecture
        self.fc1 = nn.Linear(state_dim, 64)
        self.fc2 = nn.Linear(64, 64)
        self.fc3 = nn.Linear(64, action_dim)
        self.action_bound = action_bound
        self.dropout = nn.Dropout(p=dropout_rate)

    def forward(self, state):
        state = state.to(self.device)
        # Forward pass of the actor network
        x1 = torch.relu(self.fc1(state))
        x1_drop = self.dropout(x1)
        x2 = torch.relu(self.fc2(x1_drop))
        x2_drop = self.dropout(x2)
        actions = torch.tanh(self.fc3(x2_drop))
        # Scale the tanh output to the environment's action range
        return actions * self.action_bound
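For context, this is roughly how I construct and call the actor; the dimensions below are placeholders for illustration, not my real configuration:

# Placeholder dimensions, for illustration only
actor = Actor(state_dim=10, action_dim=3, action_bound=1.0, dropout_rate=0.1)
actor = actor.to(actor.device)    # move parameters to the device the forward pass uses
state = torch.randn(64, 10)       # batch of 64 placeholder states
action = actor(state)             # shape (64, 3), scaled to [-action_bound, action_bound]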
This problem has bothered me for several days; I hope someone can help me solve it.
Best regards,
Ning