| Topic | Replies | Views | Activity |
|---|---|---|---|
| SAC doesn't converge in gym Mountain Car environment | 1 | 187 | October 22, 2024 |
| Issue with training policy networks using PPO | 3 | 106 | October 22, 2024 |
| Significant time difference between minor model architecture change | 1 | 27 | October 15, 2024 |
| Gymnasium Single Frame Render with TorchRL | 1 | 67 | October 15, 2024 |
| OpenXExperienceReplay fails | 1 | 101 | October 15, 2024 |
| Issues with PPO Tutorial and Custom Dictionary Observation Space | 1 | 122 | October 12, 2024 |
| DDPG Tutorial and Custom Environment | 0 | 81 | October 11, 2024 |
| Deep Active Inference: Issues with NaN predictions | 1 | 422 | October 2, 2024 |
| Creating custom MARL env in torchrl | 3 | 1133 | October 2, 2024 |
| PPO and DDPG with Mujoco input frames | 0 | 68 | September 26, 2024 |
| Multi Agent Reinforcement Learning A2C with LSTM, CNN, FC Layers, Graph Attention Networks | 0 | 172 | September 24, 2024 |
| PPO for Discrete Action Spaces (CartPole) | 2 | 293 | September 23, 2024 |
| Environments from scratch with TorchRL | 11 | 1108 | June 29, 2024 |
| What is the exact format of the input TensorDict for ClipPPOLoss's forward method? | 2 | 47 | September 19, 2024 |
| How do I free system RAM when from_pixels=True in SyncDataCollector? | 4 | 38 | September 10, 2024 |
| RewardSum in custom multi agent env duplicating dimension | 1 | 136 | September 10, 2024 |
| Feature Request: Consistent Dropout Implementation | 4 | 556 | September 10, 2024 |
| Why is my algorithm not learning? | 0 | 148 | July 29, 2024 |
| Leveraging half-precision training in PPO and Transformer-XL | 0 | 95 | September 2, 2024 |
| Seeking a compatible library / package to calculate second derivative using GPU and PyTorch | 2 | 27 | August 31, 2024 |
| ValueError: The shape of the spec and the CompositeSpec mismatch during shape resetting: the 1 first dimensions should match but got self['accuracy'].shape=torch.Size([1, 1]) and CompositeSpec.shape=torch.Size([1]) | 1 | 35 | August 23, 2024 |
| How to use DataLoader for ReplayBuffer | 8 | 4183 | August 10, 2024 |
| Getting the "One of the variables needed for gradient computation has been modified by an inplace operation" error while implementing PPO with a shared Module between actor and critic | 1 | 88 | July 21, 2024 |
| Saving TensorDictModule | 2 | 195 | July 19, 2024 |
| Batch size in Rollout | 1 | 248 | July 2, 2024 |
| GymWrapper observation spec | 2 | 198 | June 29, 2024 |
| How to remove zero padding when splitting a collector trajectory in the PPO tutorial? | 5 | 244 | June 28, 2024 |
| Custom env from gymnasium | 1 | 783 | June 28, 2024 |
| Custom policy with distributions for PPO | 1 | 187 | June 28, 2024 |
| PPO+LSTM working code | 5 | 3948 | June 28, 2024 |