- 20 actors with 1 learner.
- TensorFlow implementation of a server-client architecture using distributed TensorFlow.
- Recurrent Experience Replay in Distributed Reinforcement Learning is implemented on Breakout-Deterministic-v4 as a POMDP (the observation is not provided with 20% probability).
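The POMDP modification above can be sketched as an environment wrapper. This is a minimal illustration, assuming the withheld observation is simply replaced by an all-zero frame; the wrapper name and details are assumptions, not the repository's code:

```python
import numpy as np


class PartialObservationWrapper:
    """Hypothetical wrapper: with probability drop_prob the true frame is
    replaced by an all-zero frame, so the agent must rely on memory
    (e.g. an LSTM state) to act well."""

    def __init__(self, env, drop_prob=0.2, seed=None):
        self.env = env
        self.drop_prob = drop_prob
        self.rng = np.random.default_rng(seed)

    def _mask(self, obs):
        if self.rng.random() < self.drop_prob:
            return np.zeros_like(obs)  # observation not provided
        return obs

    def reset(self):
        return self._mask(self.env.reset())

    def step(self, action):
        obs, reward, done, info = self.env.step(action)
        return self._mask(obs), reward, done, info
```

Wrapping `gym.make('Breakout-Deterministic-v4')` with this class would reproduce the 20% observation-dropout setting described above.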
opencv-python
gym[atari]
tensorboardX
tensorflow==1.14.0
- Asynchronous Methods for Deep Reinforcement Learning
- IMPALA: Scalable Distributed Deep-RL with Importance Weighted Actor-Learner Architectures
- Distributed Prioritized Experience Replay
- Recurrent Experience Replay in Distributed Reinforcement Learning
 
- A3C: Asynchronous Methods for Deep Reinforcement Learning
 
python train_a3c.py --job_name learner --task 0
CUDA_VISIBLE_DEVICES=-1 python train_a3c.py --job_name actor --task 0
CUDA_VISIBLE_DEVICES=-1 python train_a3c.py --job_name actor --task 1
CUDA_VISIBLE_DEVICES=-1 python train_a3c.py --job_name actor --task 2
...
CUDA_VISIBLE_DEVICES=-1 python train_a3c.py --job_name actor --task 19
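The `--job_name`/`--task` flags follow the usual distributed-TensorFlow pattern, where each process joins a shared cluster under a job name and task index. A minimal sketch of how such flags could map onto a cluster spec (the host addresses and port numbers are illustrative assumptions, not the repository's values):

```python
import argparse

# Illustrative cluster layout: one learner plus 20 actors on localhost.
# The actual hosts/ports used by the training scripts may differ.
N_ACTORS = 20
CLUSTER = {
    "learner": ["localhost:8000"],
    "actor": ["localhost:%d" % (8001 + i) for i in range(N_ACTORS)],
}


def parse_job(argv):
    """Parse --job_name/--task the way the commands above suggest."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--job_name", choices=["learner", "actor"])
    parser.add_argument("--task", type=int, default=0)
    args = parser.parse_args(argv)
    # In TF 1.x, this dict would feed tf.train.ClusterSpec / tf.train.Server:
    #   server = tf.train.Server(tf.train.ClusterSpec(CLUSTER),
    #                            job_name=args.job_name, task_index=args.task)
    return args.job_name, args.task, CLUSTER[args.job_name][args.task]
```

For example, `parse_job(["--job_name", "actor", "--task", "3"])` resolves to the fourth actor address in the cluster.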
- Ape-X: Distributed Prioritized Experience Replay
 
python train_apex.py --job_name learner --task 0
CUDA_VISIBLE_DEVICES=-1 python train_apex.py --job_name actor --task 0
CUDA_VISIBLE_DEVICES=-1 python train_apex.py --job_name actor --task 1
CUDA_VISIBLE_DEVICES=-1 python train_apex.py --job_name actor --task 2
...
CUDA_VISIBLE_DEVICES=-1 python train_apex.py --job_name actor --task 19
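Ape-X centers on a shared prioritized replay buffer: actors attach initial priorities to the transitions they send, the learner samples proportionally to priority, and updated TD errors are written back. A toy proportional-sampling buffer, as a sketch of the idea rather than the repository's implementation:

```python
import numpy as np


class PrioritizedReplay:
    """Minimal proportional prioritized replay (Ape-X style sketch):
    transitions are sampled with probability proportional to
    priority**alpha, and priorities can be updated after learning."""

    def __init__(self, capacity, alpha=0.6, seed=None):
        self.capacity = capacity
        self.alpha = alpha
        self.data, self.priorities = [], []
        self.rng = np.random.default_rng(seed)

    def add(self, transition, priority):
        if len(self.data) >= self.capacity:  # drop the oldest transition
            self.data.pop(0)
            self.priorities.pop(0)
        self.data.append(transition)
        self.priorities.append(priority)

    def sample(self, batch_size):
        p = np.asarray(self.priorities, dtype=np.float64) ** self.alpha
        p /= p.sum()
        idx = self.rng.choice(len(self.data), size=batch_size, p=p)
        return idx, [self.data[i] for i in idx]

    def update_priorities(self, idx, new_priorities):
        # The learner writes back |TD error|-based priorities here.
        for i, pr in zip(idx, new_priorities):
            self.priorities[i] = pr
```

A production buffer would use a sum-tree for O(log n) sampling; a list is used here only to keep the sketch short.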
- IMPALA: Scalable Distributed Deep-RL with Importance Weighted Actor-Learner Architectures
 
python train_impala.py --job_name learner --task 0
CUDA_VISIBLE_DEVICES=-1 python train_impala.py --job_name actor --task 0
CUDA_VISIBLE_DEVICES=-1 python train_impala.py --job_name actor --task 1
CUDA_VISIBLE_DEVICES=-1 python train_impala.py --job_name actor --task 2
...
CUDA_VISIBLE_DEVICES=-1 python train_impala.py --job_name actor --task 19
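IMPALA's learner corrects for the policy lag between actors and learner with the V-trace target. A numpy sketch of the backward recursion from the paper (the clipping thresholds `rho_bar` and `c_bar` default to 1 as in the paper; this is not the repository's exact code):

```python
import numpy as np


def vtrace(rewards, values, bootstrap, rhos, gamma=0.99,
           rho_bar=1.0, c_bar=1.0):
    """V-trace targets v_t for t = 0..T-1.

    rhos are per-step importance ratios pi/mu, values are V(x_t),
    and bootstrap is V(x_T). Uses the recursion
    v_t = V(x_t) + delta_t + gamma * c_t * (v_{t+1} - V(x_{t+1})).
    """
    T = len(rewards)
    values_tp1 = np.append(values[1:], bootstrap)
    clipped_rho = np.minimum(rho_bar, rhos)
    clipped_c = np.minimum(c_bar, rhos)
    deltas = clipped_rho * (rewards + gamma * values_tp1 - values)
    vs = np.zeros(T)
    acc = 0.0
    for t in reversed(range(T)):  # accumulate the correction term backward
        acc = deltas[t] + gamma * clipped_c[t] * acc
        vs[t] = values[t] + acc
    return vs
```

When the behaviour and target policies coincide (all ratios equal 1), the target reduces to the on-policy n-step return.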
- R2D2: Recurrent Experience Replay in Distributed Reinforcement Learning
 
python train_r2d2.py --job_name learner --task 0
CUDA_VISIBLE_DEVICES=-1 python train_r2d2.py --job_name actor --task 0
CUDA_VISIBLE_DEVICES=-1 python train_r2d2.py --job_name actor --task 1
CUDA_VISIBLE_DEVICES=-1 python train_r2d2.py --job_name actor --task 2
...
CUDA_VISIBLE_DEVICES=-1 python train_r2d2.py --job_name actor --task 39
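R2D2 trains its recurrent network on fixed-length sequences drawn from replay, using a burn-in prefix to warm up the LSTM state before the loss is computed on the remaining steps. A sketch of that sequence construction (the burn-in/unroll lengths and stride here are illustrative, not the repository's settings):

```python
def make_sequences(episode, burn_in=4, unroll=8, stride=4):
    """Cut an episode into overlapping (burn_in, training) chunks.

    The first burn_in steps of each chunk only warm up the recurrent
    state; the following unroll steps are trained on.
    """
    length = burn_in + unroll
    chunks = []
    for start in range(0, max(1, len(episode) - length + 1), stride):
        chunk = episode[start:start + length]
        if len(chunk) == length:  # drop incomplete tail chunks
            chunks.append((chunk[:burn_in], chunk[burn_in:]))
    return chunks
```

Overlapping chunks (stride smaller than the chunk length) let every transition appear in more than one training sequence, which is the scheme the R2D2 paper describes.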
- IMPALA: Scalable Distributed Deep-RL with Importance Weighted Actor-Learner Architectures
- Distributed Prioritized Experience Replay
- Recurrent Experience Replay in Distributed Reinforcement Learning
- deepmind/scalable_agent
- google-research/seed-rl
- Asynchronous_Advatnage_Actor_Critic
- Relational_Deep_Reinforcement_Learning
- Deep Recurrent Q-Learning for Partially Observable MDPs