I am quite new to Reinforcement Learning, and I am unable to update the environment configuration for each batch of data when training with PPO.
I am using a custom-defined Gym environment and want to train it with PPO on my external data, which I load through a torch DataLoader.
I am using Python 3.11 and Ray 2.40.0. Following is the relevant code:
import ray
from ray.rllib.algorithms.ppo import PPOConfig
from ray.tune.registry import register_env
from torch.utils.data import DataLoader
train_dataset = MultimodalDataset(
    csv_file=config.TRAIN_CSV_PATH, max_images=config.MAX_IMAGES_RL
)
train_loader = DataLoader(train_dataset, batch_size=config.BATCH_SIZE, shuffle=True)
# Define PPO configuration
ppo_config = (
    PPOConfig()
    .training(gamma=0.9, lr=0.01)
    .environment(env="MultimodalSummarizationEnv", env_config=default_env_config)
    .framework("torch")
    .resources(num_gpus=0, num_cpus_per_worker=1)
)
# Create PPO trainer
trainer = ppo_config.build()
# Function to update worker environments
def update_env_config_and_reset(worker, new_env_config):
    worker.foreach_env(lambda env: env.reset(env_config=new_env_config))
# Training loop
for batch_idx, batch in enumerate(train_loader):
    # Prepare batch-specific env_config
    new_env_config = {
        # new data for the batch_idx
    }
    # Update and reset environments for all workers
    trainer.workers.foreach_worker(
        lambda worker: update_env_config_and_reset(worker, new_env_config)
    )
    # Train PPO
    result = trainer.train()
ray.shutdown()
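For context, the environment is registered with register_env and reads its per-episode data from env_config; the training loop above tries to push each new batch into the running environments through that config. Below is a heavily simplified sketch of what I mean by the custom environment (the observation/action spaces and the data handling are placeholders, not my actual logic):

import gymnasium as gym
import numpy as np
from gymnasium import spaces
from ray.tune.registry import register_env

class MultimodalSummarizationEnv(gym.Env):
    # Simplified sketch: the real environment builds observations from the
    # text/image features passed in via env_config.
    def __init__(self, env_config=None):
        self.env_config = dict(env_config or {})
        self.observation_space = spaces.Box(-np.inf, np.inf, shape=(16,), dtype=np.float32)
        self.action_space = spaces.Discrete(2)

    def reset(self, *, seed=None, options=None, env_config=None):
        super().reset(seed=seed)
        # Accept a new env_config at reset time; this is what the
        # update_env_config_and_reset helper above relies on.
        if env_config is not None:
            self.env_config = dict(env_config)
        obs = np.zeros(self.observation_space.shape, dtype=np.float32)
        return obs, {}

    def step(self, action):
        obs = np.zeros(self.observation_space.shape, dtype=np.float32)
        return obs, 0.0, True, False, {}

register_env("MultimodalSummarizationEnv", lambda cfg: MultimodalSummarizationEnv(cfg))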
However, when running the code, I get the following error on the foreach_worker call:
'function' object has no attribute 'foreach_worker'
Please help me identify where I am getting it wrong.