Need help with atari?
Click the “chat” button below for chat support from the developer who created it, or find similar developers for support.

About the developer

gsurma
200 Stars 83 Forks MIT License 42 Commits 9 Opened issues

Description

AI research environment for the Atari 2600 games 🤖.

Services available

!
?

Need anything else?

Contributors list

# 50,283
Keras
ml
space-i...
jupyter
41 commits

Atari

Research Playground built on top of OpenAI's Atari Gym, prepared for implementing various Reinforcement Learning algorithms.

It can emulate any of the following games:

['Asterix', 'Asteroids', 'MsPacman', 'Kaboom', 'BankHeist', 'Kangaroo', 'Skiing', 'FishingDerby', 'Krull', 'Berzerk', 'Tutankham', 'Zaxxon', 'Venture', 'Riverraid', 'Centipede', 'Adventure', 'BeamRider', 'CrazyClimber', 'TimePilot', 'Carnival', 'Tennis', 'Seaquest', 'Bowling', 'SpaceInvaders', 'Freeway', 'YarsRevenge', 'RoadRunner', 'JourneyEscape', 'WizardOfWor', 'Gopher', 'Breakout', 'StarGunner', 'Atlantis', 'DoubleDunk', 'Hero', 'BattleZone', 'Solaris', 'UpNDown', 'Frostbite', 'KungFuMaster', 'Pooyan', 'Pitfall', 'MontezumaRevenge', 'PrivateEye', 'AirRaid', 'Amidar', 'Robotank', 'DemonAttack', 'Defender', 'NameThisGame', 'Phoenix', 'Gravitar', 'ElevatorAction', 'Pong', 'VideoPinball', 'IceHockey', 'Boxing', 'Assault', 'Alien', 'Qbert', 'Enduro', 'ChopperCommand', 'Jamesbond']

Check out corresponding Medium article: Atari - Reinforcement Learning in depth 🤖 (Part 1: DDQN)

Purpose

The ultimate goal of this project is to implement and compare various RL approaches with atari games as a common denominator.

Usage

  1. Clone the repo.
  2. Go to the project's root folder.
  3. Install required packages
    pip install -r requirements.txt
    .
  4. Launch atari. I recommend starting with help command to see all available modes
    python atari.py --help
    .

DDQN

Hyperparameters

* GAMMA = 0.99
* MEMORY_SIZE = 900000
* BATCH_SIZE = 32
* TRAINING_FREQUENCY = 4
* TARGET_NETWORK_UPDATE_FREQUENCY = 40000
* MODEL_PERSISTENCE_UPDATE_FREQUENCY = 10000
* REPLAY_START_SIZE = 50000
* EXPLORATION_MAX = 1.0
* EXPLORATION_MIN = 0.1
* EXPLORATION_TEST = 0.02
* EXPLORATION_STEPS = 850000

Model Architecture

Deep Convolutional Neural Network by DeepMind

* Conv2D (None, 32, 20, 20)
* Conv2D (None, 64, 9, 9)
* Conv2D (None, 64, 7, 7)
* Flatten (None, 3136)
* Dense (None, 512)
* Dense (None, 4)

Trainable params: 1,686,180

Performance

After 5M of steps (~40h on Tesla K80 GPU or ~90h on 2.9 GHz Intel i7 Quad-Core CPU):

SpaceInvaders

Training:

Normalized score - each reward clipped to (-1, 1)

Testing:

Human average: ~372

DDQN average: ~479 (128%)


Breakout

Training:

Normalized score - each reward clipped to (-1, 1)

Testing:

Human average: ~28

DDQN average: ~62 (221%)

Genetic Evolution

Atlantis

Training:

Normalized score - each reward clipped to (-1, 1)

Testing:

Human average: ~29,000

GE average: 31,000 (106%)

Author

Greg (Grzegorz) Surma

PORTFOLIO

GITHUB

BLOG

We use cookies. If you continue to browse the site, you agree to the use of cookies. For more information on our use of cookies please see our Privacy Policy.