- 20 actors with 1 learner.
- TensorFlow implementation of a server-client architecture using distributed TensorFlow.
- Recurrent Experience Replay in Distributed Reinforcement Learning is implemented on Breakout-Deterministic-v4 as a POMDP (the observation is not provided with 20% probability).
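The POMDP modification above can be sketched as an environment wrapper. This is a minimal illustration, assuming the withheld observation is simply replaced by an all-zero frame; the wrapper name and details are assumptions, not the repository's code:

```python
import numpy as np


class PartialObservationWrapper:
    """Hypothetical wrapper: with probability drop_prob the true frame is
    replaced by an all-zero frame, so the agent must rely on memory
    (e.g. an LSTM state) to act well."""

    def __init__(self, env, drop_prob=0.2, seed=None):
        self.env = env
        self.drop_prob = drop_prob
        self.rng = np.random.default_rng(seed)

    def _mask(self, obs):
        if self.rng.random() < self.drop_prob:
            return np.zeros_like(obs)  # observation not provided
        return obs

    def reset(self):
        return self._mask(self.env.reset())

    def step(self, action):
        obs, reward, done, info = self.env.step(action)
        return self._mask(obs), reward, done, info
```

Wrapping `gym.make('Breakout-Deterministic-v4')` with this class would reproduce the 20% observation-dropout setting described above.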
opencv-python
gym[atari]
tensorboardX
tensorflow==1.14.0
- Asynchronous Methods for Deep Reinforcement Learning
- IMPALA: Scalable Distributed Deep-RL with Importance Weighted Actor-Learner Architectures
- Distributed Prioritized Experience Replay
- Recurrent Experience Replay in Distributed Reinforcement Learning
 
- A3C: Asynchronous Methods for Deep Reinforcement Learning
 
python train_a3c.py --job_name learner --task 0
CUDA_VISIBLE_DEVICES=-1 python train_a3c.py --job_name actor --task 0
CUDA_VISIBLE_DEVICES=-1 python train_a3c.py --job_name actor --task 1
CUDA_VISIBLE_DEVICES=-1 python train_a3c.py --job_name actor --task 2
...
CUDA_VISIBLE_DEVICES=-1 python train_a3c.py --job_name actor --task 19
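The `--job_name`/`--task` flags follow the usual distributed-TensorFlow pattern, where each process joins a shared cluster under a job name and task index. A minimal sketch of how such flags could map onto a cluster spec (the host addresses and port numbers are illustrative assumptions, not the repository's values):

```python
import argparse

# Illustrative cluster layout: one learner plus 20 actors on localhost.
# The actual hosts/ports used by the training scripts may differ.
N_ACTORS = 20
CLUSTER = {
    "learner": ["localhost:8000"],
    "actor": ["localhost:%d" % (8001 + i) for i in range(N_ACTORS)],
}


def parse_job(argv):
    """Parse --job_name/--task the way the commands above suggest."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--job_name", choices=["learner", "actor"])
    parser.add_argument("--task", type=int, default=0)
    args = parser.parse_args(argv)
    # In TF 1.x, this dict would feed tf.train.ClusterSpec / tf.train.Server:
    #   server = tf.train.Server(tf.train.ClusterSpec(CLUSTER),
    #                            job_name=args.job_name, task_index=args.task)
    return args.job_name, args.task, CLUSTER[args.job_name][args.task]
```

For example, `parse_job(["--job_name", "actor", "--task", "3"])` resolves to the fourth actor address in the cluster.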
- Ape-X: Distributed Prioritized Experience Replay
 
python train_apex.py --job_name learner --task 0
CUDA_VISIBLE_DEVICES=-1 python train_apex.py --job_name actor --task 0
CUDA_VISIBLE_DEVICES=-1 python train_apex.py --job_name actor --task 1
CUDA_VISIBLE_DEVICES=-1 python train_apex.py --job_name actor --task 2
...
CUDA_VISIBLE_DEVICES=-1 python train_apex.py --job_name actor --task 19
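Ape-X centers on a shared prioritized replay buffer: actors attach initial priorities to the transitions they send, the learner samples proportionally to priority, and updated TD errors are written back. A toy proportional-sampling buffer, as a sketch of the idea rather than the repository's implementation:

```python
import numpy as np


class PrioritizedReplay:
    """Minimal proportional prioritized replay (Ape-X style sketch):
    transitions are sampled with probability proportional to
    priority**alpha, and priorities can be updated after learning."""

    def __init__(self, capacity, alpha=0.6, seed=None):
        self.capacity = capacity
        self.alpha = alpha
        self.data, self.priorities = [], []
        self.rng = np.random.default_rng(seed)

    def add(self, transition, priority):
        if len(self.data) >= self.capacity:  # drop the oldest transition
            self.data.pop(0)
            self.priorities.pop(0)
        self.data.append(transition)
        self.priorities.append(priority)

    def sample(self, batch_size):
        p = np.asarray(self.priorities, dtype=np.float64) ** self.alpha
        p /= p.sum()
        idx = self.rng.choice(len(self.data), size=batch_size, p=p)
        return idx, [self.data[i] for i in idx]

    def update_priorities(self, idx, new_priorities):
        # The learner writes back |TD error|-based priorities here.
        for i, pr in zip(idx, new_priorities):
            self.priorities[i] = pr
```

A production buffer would use a sum-tree for O(log n) sampling; a list is used here only to keep the sketch short.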
- IMPALA: Scalable Distributed Deep-RL with Importance Weighted Actor-Learner Architectures
 
python train_impala.py --job_name learner --task 0
CUDA_VISIBLE_DEVICES=-1 python train_impala.py --job_name actor --task 0
CUDA_VISIBLE_DEVICES=-1 python train_impala.py --job_name actor --task 1
CUDA_VISIBLE_DEVICES=-1 python train_impala.py --job_name actor --task 2
...
CUDA_VISIBLE_DEVICES=-1 python train_impala.py --job_name actor --task 19
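IMPALA's learner corrects for the policy lag between actors and learner with the V-trace target. A numpy sketch of the backward recursion from the paper (the clipping thresholds `rho_bar` and `c_bar` default to 1 as in the paper; this is not the repository's exact code):

```python
import numpy as np


def vtrace(rewards, values, bootstrap, rhos, gamma=0.99,
           rho_bar=1.0, c_bar=1.0):
    """V-trace targets v_t for t = 0..T-1.

    rhos are per-step importance ratios pi/mu, values are V(x_t),
    and bootstrap is V(x_T). Uses the recursion
    v_t = V(x_t) + delta_t + gamma * c_t * (v_{t+1} - V(x_{t+1})).
    """
    T = len(rewards)
    values_tp1 = np.append(values[1:], bootstrap)
    clipped_rho = np.minimum(rho_bar, rhos)
    clipped_c = np.minimum(c_bar, rhos)
    deltas = clipped_rho * (rewards + gamma * values_tp1 - values)
    vs = np.zeros(T)
    acc = 0.0
    for t in reversed(range(T)):  # accumulate the correction term backward
        acc = deltas[t] + gamma * clipped_c[t] * acc
        vs[t] = values[t] + acc
    return vs
```

When the behaviour and target policies coincide (all ratios equal 1), the target reduces to the on-policy n-step return.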
- R2D2: Recurrent Experience Replay in Distributed Reinforcement Learning
 
python train_r2d2.py --job_name learner --task 0
CUDA_VISIBLE_DEVICES=-1 python train_r2d2.py --job_name actor --task 0
CUDA_VISIBLE_DEVICES=-1 python train_r2d2.py --job_name actor --task 1
CUDA_VISIBLE_DEVICES=-1 python train_r2d2.py --job_name actor --task 2
...
CUDA_VISIBLE_DEVICES=-1 python train_r2d2.py --job_name actor --task 39
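R2D2 trains its recurrent network on fixed-length sequences drawn from replay, using a burn-in prefix to warm up the LSTM state before the loss is computed on the remaining steps. A sketch of that sequence construction (the burn-in/unroll lengths and stride here are illustrative, not the repository's settings):

```python
def make_sequences(episode, burn_in=4, unroll=8, stride=4):
    """Cut an episode into overlapping (burn_in, training) chunks.

    The first burn_in steps of each chunk only warm up the recurrent
    state; the following unroll steps are trained on.
    """
    length = burn_in + unroll
    chunks = []
    for start in range(0, max(1, len(episode) - length + 1), stride):
        chunk = episode[start:start + length]
        if len(chunk) == length:  # drop incomplete tail chunks
            chunks.append((chunk[:burn_in], chunk[burn_in:]))
    return chunks
```

Overlapping chunks (stride smaller than the chunk length) let every transition appear in more than one training sequence, which is the scheme the R2D2 paper describes.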
- IMPALA: Scalable Distributed Deep-RL with Importance Weighted Actor-Learner Architectures
- Distributed Prioritized Experience Replay
- Recurrent Experience Replay in Distributed Reinforcement Learning
- deepmind/scalable_agent
- google-research/seed-rl
- Asynchronous_Advatnage_Actor_Critic
- Relational_Deep_Reinforcement_Learning
- Deep Recurrent Q-Learning for Partially Observable MDPs