Greetings everyone! I am happy to announce that Machin, my RL library designed for PyTorch, is close to its first public release after several months of intensive development!
Machin is designed with the elegant torch style in mind, while aiming to cover most of the functions provided by Ray. It is hosted on GitHub at iffiX/machin: a reinforcement learning library (framework) designed for PyTorch that implements DQN, DDPG, A2C, PPO, SAC, MADDPG, A3C, APEX, IMPALA ...
Currently, Machin is at this development stage:
alpha-1: laying down a framework for all code
alpha-2: finish testing all modules
alpha-3: clean and update old examples
alpha-3.5: add architecture docs, tutorial docs, etc.
beta-1: public testing, collect feedback and refine the design <- Machin is here
first official release!
Algorithms
Machin is able to support a variety of RL algorithms, including:
Single agent algorithms:
- Deep Q-Network (DQN)
- Double DQN
- Dueling DQN
- RAINBOW
- Deep Deterministic Policy Gradient (DDPG)
- Twin Delayed DDPG (TD3)
- Hysteretic DDPG (modified from Hys-DQN)
- Advantage Actor-Critic (A2C)
- Proximal Policy Optimization (PPO)
- Soft Actor Critic (SAC)
Multi-agent algorithms:
- MADDPG
Massively parallel algorithms:
- A3C
- APEX
- IMPALA
Enhancements:
- Prioritized Experience Replay (PER)
- Generalized Advantage Estimation (GAE)
- Recurrent networks in DQN, etc.
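To give a feel for one of the enhancements listed above, here is a minimal sketch of Generalized Advantage Estimation in plain Python. It is not Machin's API: the function name `compute_gae` and the parameters `gamma`, `lam`, and `last_value` are hypothetical names chosen for illustration, and the sketch assumes a single trajectory with no episode termination inside it.

```python
def compute_gae(rewards, values, last_value, gamma=0.99, lam=0.95):
    """Return GAE advantages for one trajectory.

    rewards[t] and values[t] belong to step t; last_value is the value
    estimate of the state reached after the final step (hypothetical
    signature, not taken from Machin).
    """
    advantages = [0.0] * len(rewards)
    next_value = last_value
    running = 0.0
    # Walk backwards so each step can reuse the accumulated future terms.
    for t in reversed(range(len(rewards))):
        # One-step TD error: r_t + gamma * V(s_{t+1}) - V(s_t)
        delta = rewards[t] + gamma * next_value - values[t]
        # Exponentially weighted sum of TD errors with decay gamma * lam
        running = delta + gamma * lam * running
        advantages[t] = running
        next_value = values[t]
    return advantages


if __name__ == "__main__":
    print(compute_gae([1.0, 1.0, 1.0], [0.0, 0.0, 0.0], 0.0, gamma=1.0, lam=1.0))
```

With `lam=0` this reduces to the one-step TD error, and with `lam=1` to the discounted return minus the value baseline, which is the bias/variance trade-off GAE is meant to expose.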
Algorithms to be supported:
- Generative Adversarial Imitation Learning (GAIL)
- Evolution Strategies
- Linear Optimization Methods
- QMIX (multi agent)
- Model-based methods
Parallel
Machin is capable of:
- Distributed & asynchronous training: provides implementations of parameter servers and gradient reduction servers.
- Advanced RPC: role-based launching, service registration, and resource sharing, built on top of torch RPC.
- Process-level parallelization, based on enhanced processes, threads, pools, and events.
- Model-level parallelization: splitting (not yet implemented) and assigning model shards to devices heuristically.
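As a rough illustration of the parameter server and gradient reduction idea mentioned in the list above, here is a small in-process sketch using only the Python standard library. It is a simplification, not Machin's implementation: it uses threads instead of torch RPC, reduces gradients synchronously rather than asynchronously, and every name in it (`ParameterServer`, `shard_gradient`, `train`) is hypothetical.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


class ParameterServer:
    """Hypothetical in-process stand-in for a parameter server."""

    def __init__(self, weight):
        self._weight = weight
        self._version = 0
        self._lock = threading.Lock()

    def pull(self):
        # Workers fetch the latest weight together with its version.
        with self._lock:
            return self._weight, self._version

    def push(self, gradient, lr):
        # Apply one SGD step and bump the version.
        with self._lock:
            self._weight -= lr * gradient
            self._version += 1


def shard_gradient(weight, shard):
    # Gradient of the mean of 0.5 * (weight * x - y)^2 over one data shard.
    return sum((weight * x - y) * x for x, y in shard) / len(shard)


def train(shards, steps=50, lr=0.1):
    server = ParameterServer(weight=0.0)
    with ThreadPoolExecutor(max_workers=len(shards)) as pool:
        for _ in range(steps):
            weight, _ = server.pull()
            # Each worker computes a gradient on its own shard.
            grads = list(pool.map(lambda s: shard_gradient(weight, s), shards))
            # Gradient reduction: average the worker gradients, then update.
            server.push(sum(grads) / len(grads), lr)
    return server.pull()


if __name__ == "__main__":
    # Toy data generated from y = 2 * x, split into two shards.
    shards = [[(1.0, 2.0), (2.0, 4.0)], [(3.0, 6.0)]]
    print(train(shards))
```

In a real distributed setup the workers would live in separate processes or machines and talk to the server over RPC; the pull, compute, reduce, push cycle is the part this sketch is meant to show.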
Utilities
- Model checking toolset: checks for NaN values, abnormal gradients, etc. in your model, based on hooks.
- Logging of videos and images produced during training.
- Model serialization, auto loading and management toolset. No longer worry about unmatched devices!
- Many more.
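For readers wondering what hook-based model checking can look like, here is a minimal sketch in plain PyTorch. It is not Machin's toolset: the helper name `attach_nan_checks` is hypothetical, and it only covers NaN detection in forward outputs, using the standard `register_forward_hook` mechanism of `torch.nn.Module`.

```python
import torch
import torch.nn as nn


def attach_nan_checks(model):
    """Register forward hooks that raise when a submodule outputs NaN.

    Hypothetical helper for illustration; returns the hook handles so the
    checks can be removed again with handle.remove().
    """
    handles = []
    for name, module in model.named_modules():
        # Bind the module name as a default argument so each hook keeps its own.
        def hook(mod, inputs, output, name=name):
            if isinstance(output, torch.Tensor) and torch.isnan(output).any():
                raise RuntimeError(f"NaN detected in output of module '{name}'")

        handles.append(module.register_forward_hook(hook))
    return handles


if __name__ == "__main__":
    model = nn.Sequential(nn.Linear(2, 2), nn.ReLU())
    handles = attach_nan_checks(model)
    model(torch.ones(1, 2))  # passes silently
    try:
        model(torch.tensor([[float("nan"), 1.0]]))
    except RuntimeError as err:
        print(err)
    for handle in handles:
        handle.remove()
```

Checking gradients would follow the same pattern with backward hooks, at the cost of some overhead on every pass, so checks like this are usually enabled only while debugging.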