RuntimeError: copy_if failed to synchronize: device-side assert triggered

I’m getting the following errors with my code. It is an adapted version of the PyTorch DQN example.

/pytorch/aten/src/THC/THCTensorScatterGather.cu:97: void THCudaTensor_gatherKernel(TensorInfo<Real, IndexType>, TensorInfo<Real, IndexType>, TensorInfo<long, IndexType>, int, IndexType) [with IndexType = unsigned int, Real = float, Dims = 2]: block: [0,0,0], thread: [62,0,0] Assertion `indexValue >= 0 && indexValue < src.sizes[dim]` failed.
Traceback (most recent call last):
  File "/project/ros-kinetic-alphapilot/catkin_ws/src/alphapilot_openai_ros/ardrone_race_track/src/ardrone_v1_ddqn.py", line 548, in <module>
    optimize_model()
  File "/project/ros-kinetic-alphapilot/catkin_ws/src/alphapilot_openai_ros/ardrone_race_track/src/ardrone_v1_ddqn.py", line 451, in optimize_model
    next_state_values[non_final_mask] = target_net(non_final_next_states).max(1)[0].detach()
RuntimeError: copy_if failed to synchronize: device-side assert triggered

The hyperparameters are as follows:

I ran with device=cpu to debug, and the error is raised at line 443:

  File "/project/ros-kinetic-alphapilot/catkin_ws/src/alphapilot_openai_ros/ardrone_race_track/src/ardrone_v1_ddqn.py", line 443, in optimize_model
    state_action_values = policy_net(state_batch).gather(1, action_batch)
RuntimeError: Invalid index in gather at /pytorch/aten/src/TH/generic/THTensorEvenMoreMath.cpp:457

@ptrblck would you happen to know how I can fix this?

The default cartpole example had 2 actions.

    else:
        return torch.tensor([[random.randrange(2)]], device=device, dtype=torch.long)
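For reference, a minimal sketch of parameterizing that branch on the environment's action count instead of the literal 2 (`n_actions` and `device` here are assumptions standing in for the values used in my script):

```python
import random
import torch

device = torch.device("cpu")  # assumption: CPU for this sketch
n_actions = 7                 # assumption: my environment's action count

# Draw a random action index from the full action range
# instead of the hard-coded random.randrange(2)
action = torch.tensor([[random.randrange(n_actions)]],
                      device=device, dtype=torch.long)
print(action.shape)  # torch.Size([1, 1])
```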

I noticed that it crashes when I update it for my task environment, which has 7 actions.

N_ACTIONS=7 in this case. If I set it to 2, there is no crash.

Why would this be an issue? I can’t seem to locate any other part of the code that hard-codes the total number of actions.

Could you print the shape of policy_net(state_batch) and the min and max values of action_batch?
Some indices are apparently out of bounds for the gather operation.

@ptrblck Here is the output.

print("policy_net(state_batch).shape: {}".format(policy_net(state_batch).shape))
print("state_batch.shape: {}".format(state_batch.shape))
print("action_batch: shape= {}, max= {}, min= {}".format(action_batch.shape, action_batch.max(), action_batch.min()))
state_action_values = policy_net(state_batch).gather(1, action_batch)

output:

policy_net(state_batch).shape: torch.Size([128, 2])
state_batch.shape: torch.Size([128, 3, 180, 320])
action_batch: shape= torch.Size([128, 1]), max= 6, min= 0

I have a total of 7 actions. Action values 0 to 6 are mapped to the following drone movements: FORWARDS, BACKWARDS, STRAFE_LEFT, STRAFE_RIGHT, UP, DOWN, STOP.

There are a total of 8 observations: x, y, z, r, p, y, sonar_value, collision.

I’m running it on the CPU to debug, and it gives the following error:

  File "/project/ros-kinetic-alphapilot/catkin_ws/src/alphapilot_openai_ros/ardrone_race_track/src/ardrone_v1_ddqn.py", line 446, in optimize_model
    state_action_values = policy_net(state_batch).gather(1, action_batch)
RuntimeError: Invalid index in gather at /pytorch/aten/src/TH/generic/THTensorEvenMoreMath.cpp:457
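The mismatch can be reproduced in isolation: `gather` along dim 1 requires every index to be smaller than the size of that dimension, so action indices up to 6 fail against a `[128, 2]` network output. A small repro using the shapes from the printout above:

```python
import torch

q_values = torch.randn(128, 2)                       # network output: only 2 actions
actions = torch.full((128, 1), 6, dtype=torch.long)  # action index 6 from a 7-action env

try:
    q_values.gather(1, actions)  # index 6 is out of bounds for size-2 dim 1
except (RuntimeError, IndexError) as e:
    print("gather failed:", e)
```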

@ptrblck After viewing the shape of policy_net(state_batch), it would appear that the number of outputs was hard-coded to 2.

@ptrblck I’ve submitted a pull request with updates to the reinforcement_q_learning.py tutorial. I’ve made the DQN network accept the number of outputs as a constructor argument and updated the example to obtain the number of actions from the gym environment’s action space. This should help others who try the DQN example with different gym environments avoid similar issues.
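For anyone hitting the same issue before the tutorial is updated, here is a rough sketch of the change. The layer sizes are illustrative, not the exact tutorial values; `env.action_space.n` is the usual way to read the action count from a gym environment with a discrete action space:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class DQN(nn.Module):
    """Sketch of a DQN whose output size is passed in rather than
    hard-coded to 2 (two conv layers here for brevity)."""
    def __init__(self, h, w, outputs):
        super().__init__()
        self.conv1 = nn.Conv2d(3, 16, kernel_size=5, stride=2)
        self.bn1 = nn.BatchNorm2d(16)
        self.conv2 = nn.Conv2d(16, 32, kernel_size=5, stride=2)
        self.bn2 = nn.BatchNorm2d(32)

        # Spatial size after one conv with kernel 5, stride 2
        def conv_out(size, kernel=5, stride=2):
            return (size - (kernel - 1) - 1) // stride + 1

        convw = conv_out(conv_out(w))
        convh = conv_out(conv_out(h))
        # Head size now depends on `outputs` instead of the literal 2
        self.head = nn.Linear(convw * convh * 32, outputs)

    def forward(self, x):
        x = F.relu(self.bn1(self.conv1(x)))
        x = F.relu(self.bn2(self.conv2(x)))
        return self.head(x.view(x.size(0), -1))

# With a gym environment this would be: n_actions = env.action_space.n
n_actions = 7
net = DQN(180, 320, n_actions)
out = net(torch.randn(4, 3, 180, 320))
print(out.shape)  # torch.Size([4, 7])
```

With the output size wired through the constructor, the same network definition works for CartPole's 2 actions and my drone environment's 7 actions.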


Yeah, it looks like the hard-coded number of outputs creates this issue. Thanks for the PR and the fix! :slight_smile: