Hi,
After running a DQN network for several hundred episodes, I get a CUDA out-of-memory error:
Traceback (most recent call last):
  File "/project/ros-kinetic-alphapilot/catkin_ws/src/alphapilot_openai_ros/ardrone_race_track/src/ardrone_v1_ddqn.py", line 533, in <module>
    optimize_model()
  File "/project/ros-kinetic-alphapilot/catkin_ws/src/alphapilot_openai_ros/ardrone_race_track/src/ardrone_v1_ddqn.py", line 436, in optimize_model
    next_state_values[non_final_mask] = target_net(non_final_next_states).max(1)[0].detach()
  File "/usr/local/lib/python2.7/dist-packages/torch/nn/modules/module.py", line 489, in __call__
    result = self.forward(*input, **kwargs)
  File "/project/ros-kinetic-alphapilot/catkin_ws/src/alphapilot_openai_ros/ardrone_race_track/src/ardrone_v1_ddqn.py", line 285, in forward
    x = F.relu(self.bn2(self.conv2(x)))
  File "/usr/local/lib/python2.7/dist-packages/torch/nn/functional.py", line 862, in relu
    result = torch.relu(input)
RuntimeError: CUDA out of memory. Tried to allocate 155.12 MiB (GPU 0; 11.75 GiB total capacity; 8.74 GiB already allocated; 163.25 MiB free; 455.51 MiB cached)
Which parts of the NN model, or which other variables, should I look at to find where the leak is?
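In case it helps anyone diagnosing the same thing: one generic way I've seen to localise this kind of leak (nothing PyTorch-specific; the helper name `type_growth` below is my own) is to snapshot live-object counts with `gc` between episodes and watch which type keeps growing. In a DQN, a steadily rising `Tensor` count usually means tensors are being kept alive with their autograd graph attached, e.g. pushing un-detached states into the replay buffer, or accumulating the loss for logging without calling `.item()` on it first.

```python
import gc
from collections import Counter

def type_growth(prev_counts=None):
    """Snapshot how many live objects of each type the process holds.

    Call once before an episode and once after, passing the first
    snapshot back in; the returned `growth` dict shows which types
    accumulated. A 'Tensor' entry that grows every episode is a
    strong hint that tensors (and their graphs) are being retained.
    """
    counts = Counter(type(o).__name__ for o in gc.get_objects())
    if prev_counts is None:
        return counts, None
    growth = {t: counts[t] - prev_counts.get(t, 0)
              for t in counts if counts[t] > prev_counts.get(t, 0)}
    return counts, growth
```

Usage would be something like `snap, _ = type_growth()` before the training loop, then `snap, growth = type_growth(snap)` every N episodes and printing `growth`. Alongside this, `torch.cuda.memory_allocated()` logged per episode tells you whether GPU memory itself is growing monotonically or just fragmenting.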