Short video of the problem: Godot RL Agents.m4v - Google Drive
Hello, I am a newcomer: I started using PyTorch and Godot RL Agents yesterday, following the steps of this 10-minute tutorial (https://youtu.be/f8arMv_rtUU?si=sqsQw0FW2nJAmXkd&t=355). I am stuck at minute 5:55, when executing the training: it only runs for about 5 seconds, then it crashes and closes itself, leaving this output in the log:
(rl) caferino@caferino:~/Documents/Python Envs$ python3 stable_baselines3_example.py
No game binary has been provided, please press PLAY in the Godot editor
waiting for remote GODOT connection on port 11008
connection established
action space {'move': {'size': 2, 'action_type': 'continuous'}}
observation space {'obs': {'size': [0], 'space': 'box'}}
Using cuda device
/home/caferino/Documents/Python Envs/rl/lib/python3.10/site-packages/torch/nn/init.py:412: UserWarning: Initializing zero-element tensors is a no-op
warnings.warn("Initializing zero-element tensors is a no-op")
Logging to logs/sb3/experiment_13
/home/caferino/Documents/Python Envs/rl/lib/python3.10/site-packages/torch/nn/modules/linear.py:114: UserWarning: An output with one or more elements was resized since it had shape [1, 64], which does not match the required output shape [64]. This behavior is deprecated, and in a future PyTorch release outputs will not be resized unless they have zero elements. You can explicitly reuse an out tensor t by resizing it, inplace, to zero elements with t.resize_(0). (Triggered internally at …/aten/src/ATen/native/Resize.cpp:28.)
return F.linear(input, self.weight, self.bias)
| time/ | |
| fps | 6 |
| iterations | 1 |
| time_elapsed | 5 |
| total_timesteps | 32 |
/home/caferino/Documents/Python Envs/rl/lib/python3.10/site-packages/torch/nn/modules/linear.py:114: UserWarning: An output with one or more elements was resized since it had shape [32, 64], which does not match the required output shape [64]. This behavior is deprecated, and in a future PyTorch release outputs will not be resized unless they have zero elements. You can explicitly reuse an out tensor t by resizing it, inplace, to zero elements with t.resize_(0). (Triggered internally at …/aten/src/ATen/native/Resize.cpp:28.)
return F.linear(input, self.weight, self.bias)
/home/caferino/Documents/Python Envs/rl/lib/python3.10/site-packages/stable_baselines3/ppo/ppo.py:248: UserWarning: Using a target size (torch.Size([1])) that is different to the input size (torch.Size([32])). This will likely lead to incorrect results due to broadcasting. Please ensure they have the same size.
value_loss = F.mse_loss(rollout_data.returns, values_pred)
Traceback (most recent call last):
  File "/home/caferino/Documents/Python Envs/stable_baselines3_example.py", line 215, in <module>
    model.learn(**learn_arguments)
  File "/home/caferino/Documents/Python Envs/rl/lib/python3.10/site-packages/stable_baselines3/ppo/ppo.py", line 315, in learn
    return super().learn(
  File "/home/caferino/Documents/Python Envs/rl/lib/python3.10/site-packages/stable_baselines3/common/on_policy_algorithm.py", line 299, in learn
    self.train()
  File "/home/caferino/Documents/Python Envs/rl/lib/python3.10/site-packages/stable_baselines3/ppo/ppo.py", line 279, in train
    loss.backward()
  File "/home/caferino/Documents/Python Envs/rl/lib/python3.10/site-packages/torch/_tensor.py", line 492, in backward
    torch.autograd.backward(
  File "/home/caferino/Documents/Python Envs/rl/lib/python3.10/site-packages/torch/autograd/__init__.py", line 251, in backward
    Variable._execution_engine.run_backward(  # Calls into the C++ engine to run the backward pass
RuntimeError: mat2 must be a matrix
exit was not clean, using atexit to close env
close message sent
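One detail in the log above that may be worth ruling out: the observation space is reported with a size of [0], and the first PyTorch warning mentions zero-element tensors. Whether this is the actual cause of the crash is not established here. The snippet below is only a minimal sketch for spotting such an entry, assuming the dictionary layout printed in the log; the helper name `empty_observation_keys` is hypothetical and not part of Godot RL Agents.

```python
from math import prod


def empty_observation_keys(observation_space):
    """Return the keys whose declared size contains zero elements.

    `observation_space` is assumed to look like the dict printed in the log,
    e.g. {'obs': {'size': [0], 'space': 'box'}}.
    """
    empty = []
    for key, spec in observation_space.items():
        size = spec.get("size", [])
        # A size such as [0] or [4, 0] multiplies out to zero elements.
        # A missing or empty size multiplies out to 1 and is not flagged.
        if prod(size) == 0:
            empty.append(key)
    return empty


# Value copied from the "observation space" line of the log.
logged_space = {"obs": {"size": [0], "space": "box"}}
print(empty_observation_keys(logged_space))
```

If the helper returns a non-empty list, the agent is sending an observation with no values in it, which would be something to check on the Godot side before changing anything in PyTorch or ppo.py.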
I am still not experienced with the tools, but I have tried several things: looking for Resize.cpp (it doesn't show up anywhere, only Resize.h); trying to find where to change mm to matmul; using batch_size = 64 instead of 32 in ppo.py to avoid a "mini-batch" problem I also had; and trying to find where to use t.resize_(0), although I read that .view or .reshape is the recommended way. I have also been reading the documentation to learn new concepts about the software, but so far I have had no luck with anything related to modifying tensors or fixing the mat2 error.
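For reference, here is a small sketch of how the three calls mentioned above (.view, .reshape and .resize_(0)) behave on a toy tensor. It only illustrates the calls themselves; it is not taken from the tutorial, and nothing here shows that any of them is the right fix for the crash.

```python
import torch

t = torch.arange(6).reshape(2, 3)   # contiguous 2x3 tensor: [[0, 1, 2], [3, 4, 5]]

# .view reinterprets the same storage with a new shape (no copy).
flat_view = t.view(6)

# A transposed tensor is non-contiguous, so .view cannot flatten it.
transposed = t.t()
try:
    transposed.view(6)
    view_failed = False
except RuntimeError:
    view_failed = True

# .reshape falls back to copying when a view is not possible.
flat_copy = transposed.reshape(6)

# .resize_(0) shrinks a tensor in place to zero elements.
scratch = torch.empty(4)
scratch.resize_(0)

print(flat_view.tolist(), view_failed, flat_copy.tolist(), scratch.numel())
```

In short, .view and .reshape change how existing values are laid out, while .resize_(0) empties the tensor, so they are not interchangeable.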
I don't know what I am doing yet tbh, but I want to learn. I followed the tutorial as is, as an introduction. I did notice, though, that at minute 1:28 the file being used, "stable_baselines3_example.py", has 75 fewer lines of code than the current version (updated 4 days ago), so maybe I need to configure something not shown in the video. Or my laptop is simply a mess: it has given me a lot of problems before, because of the dual boot or something sketchy in my files. I will keep searching for the solution and share it if I find it.
My laptop neofetch:
caferino@caferino
OS: Pop!_OS 22.04 LTS x86_64
Host: HP ENVY m7 Notebook
Kernel: 6.6.10-76060610-generic
Uptime: 1 day, 20 hours, 54 minutes
Packages: 3698 (dpkg), 44 (flatpak)
Shell: bash 5.1.16
Resolution: 1280x720, 1920x1080
DE: GNOME 42.5
WM: Mutter
WM Theme: Pop
Theme: Pop-dark [GTK2/3]
Icons: Pop [GTK2/3]
Terminal: tilix
CPU: Intel i7-7500U (4) @ 3.00GHz
GPU: Intel HD Graphics 620
GPU: NVIDIA GeForce 940MX
Memory: 6999MiB / 15880MiB