Developer: Svalorzen. License: GNU General Public License v3.0.

A C++ framework for MDPs and POMDPs with Python bindings

AI-Toolbox

This C++ toolbox aims to represent and solve common AI problems through an easy-to-use interface that should extend to many kinds of problems while keeping the code readable.

Current development includes MDPs, POMDPs and related algorithms. This toolbox was originally developed taking inspiration from the Matlab `MDPToolbox`, which you can find here, and from the `pomdp-solve` software written by A. R. Cassandra, which you can find here.

If you use this toolbox for research, please consider citing our JMLR article:

@article{JMLR:v21:18-402,
  author  = {Eugenio Bargiacchi and Diederik M. Roijers and Ann Now\'{e}},
  title   = {AI-Toolbox: A C++ library for Reinforcement Learning and Planning (with Python Bindings)},
  journal = {Journal of Machine Learning Research},
  year    = {2020},
  volume  = {21},
  number  = {102},
  pages   = {1-12},
  url     = {http://jmlr.org/papers/v21/18-402.html}
}

Description

This toolbox provides implementations of several reinforcement learning (RL) and planning algorithms. An excellent introduction to the basics can be found freely online in this book.

The implemented algorithms can be applied in several settings: single-agent, multi-agent, multi-objective, competitive, cooperative, partially observable, and so on. We strive to maintain a consistent interface throughout all domains for ease of use. The toolbox is actively developed and used in research.

Implementations are kept as simple as possible and with relatively few options compared to other libraries; we believe that this makes the code easier to read and modify to best suit your needs.

Please note that the API may change over time (although most of it is stable at this point), since I may decide to alter it to improve overall consistency as the toolbox grows.

Documentation

The latest documentation is available here. Keep in mind that it may not always be 100% up to date with the latest commits, while the one you compile yourself will of course be.

For the Python docs, type `help(AIToolbox)` from the interpreter. It should show the exported API for each class, along with any differences in input/output.

Features

Cassandra POMDP Format Parsing

Cassandra's POMDP format is a text file format that contains the definition of an MDP or POMDP model. You can find some examples here. Using this format is entirely optional, since you can also define models in code, but the library parses a reasonable subset of it, which lets you reuse problems that have already been defined. Here are the docs on that.
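For readers who have not seen the format, the sketch below shows the preamble of the classic tiger problem as it is commonly written in Cassandra's format, together with a tiny Python function that reads only those header fields. This is an illustration of the file layout, not the library's parser: the helper name `parse_preamble` is made up for this example, and the transition, observation and reward sections of the file are ignored.

```python
# Illustration only: reads the preamble of a Cassandra-style POMDP file.
# `parse_preamble` is a hypothetical helper, not part of AI-Toolbox.

TIGER_PREAMBLE = """\
# Preamble of the classic tiger problem
discount: 0.95
values: reward
states: tiger-left tiger-right
actions: listen open-left open-right
observations: tiger-left tiger-right
"""

def parse_preamble(text):
    """Return the header fields of a Cassandra-style model as a dict."""
    fields = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and blank lines
        if not line or ":" not in line:
            continue
        key, value = (part.strip() for part in line.split(":", 1))
        if key == "discount":
            fields[key] = float(value)
        elif key == "values":
            fields[key] = value
        elif key in ("states", "actions", "observations"):
            # Either a count ("states: 2") or a list of names.
            names = value.split()
            if len(names) == 1 and names[0].isdigit():
                fields[key] = int(names[0])
            else:
                fields[key] = names
    return fields

header = parse_preamble(TIGER_PREAMBLE)
print(header["discount"], header["actions"])
```

Any key the function does not recognise (for example the `T:`, `O:` and `R:` entries that follow the preamble in a full file) is skipped rather than parsed.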

Python 2 and 3 Bindings!

The user interface of the library in Python is pretty much the same as the one you get in C++. See the `examples` folder to see just how closely the Python and C++ code resemble each other. Since Python does not support templates, the classes are bound with as many instantiations as possible.

Additionally, the library supports native Python generative models (where you don't need to specify the transition and reward functions; you only sample the next state and reward). This allows you, for example, to use OpenAI gym environments directly with minimal code.
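A generative model in this sense is just an object that can be sampled. The sketch below is plain Python and does not import the library. The method names (`getS`, `getA`, `getDiscount`, `isTerminal`, `sampleSR`) are assumptions chosen to resemble the library's naming style, so check `help(AIToolbox)` for the names the bindings actually expect. The `CorridorModel` class and its dynamics are made up for illustration.

```python
import random

class CorridorModel:
    """Hypothetical 1-D corridor: start on the left, the goal is the last cell."""

    def __init__(self, length=5, slip=0.1, discount=0.9, seed=None):
        self.length = length
        self.slip = slip            # probability that a move has no effect
        self.discount = discount
        self.rng = random.Random(seed)

    # Method names below are assumptions, not a confirmed interface.
    def getS(self):
        return self.length

    def getA(self):
        return 2                    # 0 = left, 1 = right

    def getDiscount(self):
        return self.discount

    def isTerminal(self, s):
        return s == self.length - 1

    def sampleSR(self, s, a):
        """Sample (next_state, reward) without building a transition table."""
        if self.rng.random() < self.slip:
            s1 = s
        else:
            step = 1 if a == 1 else -1
            s1 = min(max(s + step, 0), self.length - 1)
        reward = 1.0 if self.isTerminal(s1) else 0.0
        return s1, reward

# Roll out a fixed "always move right" policy until the goal is reached.
model = CorridorModel(seed=0)
s, total, steps = 0, 0.0, 0
while not model.isTerminal(s) and steps < 1000:
    s, r = model.sampleSR(s, 1)
    total += r
    steps += 1
print(steps, total)
```

The point of the design is that the transition and reward functions never have to be written out as tables; only single samples are ever requested.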

That said, if you need to customize a specific implementation to make it perform better on your specific use-cases, or if you want to try something completely new, you will have to use C++.

Utilities

The library has an extensive set of utilities which would be too long to enumerate here. In particular, we have utilities for combinatorics, polytopes, linear programming, sampling and distributions, automated statistics, belief updating, many data structures, logging, seeding and much more.

Bandit/Normal Games:

| | Policies | |
| :---: | :---: | :---: |
| Exploring Selfish Reinforcement Learning (ESRL) | Q-Greedy Policy | Softmax Policy |
| Linear Reward Penalty | Thompson Sampling (Student-t distribution) | Random Policy |

Single Agent MDP/Stochastic Games:

| | Models | |
| :---: | :---: | :---: |
| Basic Model | Sparse Model | Maximum Likelihood Model |
| Sparse Maximum Likelihood Model | Thompson Model (Dirichlet + Student-t distributions) | |
| | Algorithms | |
| Dyna-Q | Dyna2 | Expected SARSA |
| Hysteretic Q-Learning | Importance Sampling | Linear Programming |
| Monte Carlo Tree Search (MCTS) | Policy Evaluation | Policy Iteration |
| Prioritized Sweeping | Q-Learning | Double Q-Learning |
| Q(λ) | R-Learning | SARSA(λ) |
| SARSA | Retrace(λ) | Tree Backup(λ) |
| Value Iteration | | |
| | Policies | |
| Basic Policy | Epsilon-Greedy Policy | Softmax Policy |
| Q-Greedy Policy | PGA-APP | Win or Learn Fast Policy Iteration (WoLF) |
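
As a rough illustration of what the "Basic Model" and "Value Iteration" entries refer to, the sketch below stores a small MDP as dense transition and reward tables and runs tabular value iteration over it in plain Python. It does not use the library: the function name, the table layout and the two-state example are assumptions made for this example only.

```python
def value_iteration(T, R, discount, tolerance=1e-6, max_iters=10_000):
    """Tabular value iteration (illustrative, not the library's implementation).

    T[s][a][s1] is a transition probability, R[s][a][s1] the matching reward.
    Returns (values, greedy_policy).
    """
    n_states, n_actions = len(T), len(T[0])

    def q(values, s, a):
        # Expected return of taking action a in state s, then following `values`.
        return sum(T[s][a][s1] * (R[s][a][s1] + discount * values[s1])
                   for s1 in range(n_states))

    values = [0.0] * n_states
    for _ in range(max_iters):
        new_values = [max(q(values, s, a) for a in range(n_actions))
                      for s in range(n_states)]
        delta = max(abs(n - o) for n, o in zip(new_values, values))
        values = new_values
        if delta < tolerance:
            break

    policy = [max(range(n_actions), key=lambda a: q(values, s, a))
              for s in range(n_states)]
    return values, policy

# Two states, two actions. Staying in state 1 (action 0) pays a reward of 1.
T = [
    [[1.0, 0.0], [0.0, 1.0]],   # state 0: action 0 stays, action 1 moves to 1
    [[0.0, 1.0], [1.0, 0.0]],   # state 1: action 0 stays, action 1 moves to 0
]
R = [
    [[0.0, 0.0], [0.0, 0.0]],
    [[0.0, 1.0], [0.0, 0.0]],
]
values, policy = value_iteration(T, R, discount=0.9)
print(values, policy)
```

With a discount of 0.9 the value of state 1 approaches 1 / (1 - 0.9) = 10 and the value of state 0 approaches 0.9 × 10 = 9, which gives a quick sanity check on the result.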

Single Agent POMDP:

| | Models | |
| :---: | :---: | :---: |
| Basic Model | Sparse Model | |
| | Algorithms | |
| Augmented MDP (AMDP) | Blind Strategies | Fast Informed Bound |
| GapMin | Incremental Pruning | Linear Support |
| PERSEUS | POMCP with UCB1 | Point Based Value Iteration (PBVI) |
| QMDP | Real-Time Belief State Search (RTBSS) | SARSOP |
| Witness | rPOMCP | |
| | Policies | |
| Basic Policy | | |

Factored/Joint Multi-Agent:

Bandits:

Not in Python yet.

| | Algorithms | |
| :---: | :---: | :---: |
| Max-Plus | Multi-Objective Variable Elimination (MOVE) | Upper Confidence Variable Elimination (UCVE) |
| Variable Elimination | | |
| | Policies | |
| Q-Greedy Policy | Random Policy | Learning with Linear Rewards (LLR) |
| Multi-Agent Upper Confidence Exploration (MAUCE) | Multi-Agent Thompson-Sampling (Student-t distribution) | Single-Action Policy |

MDP:

Not in Python yet.

| | Models | |
| :---: | :---: | :---: |
| Cooperative Basic Model | Cooperative Maximum Likelihood Model | Cooperative Thompson Model (Dirichlet + Student-t distributions) |
| | Algorithms | |
| FactoredLP | Multi Agent Linear Programming | Joint Action Learners |
| Sparse Cooperative Q-Learning | Cooperative Prioritized Sweeping | |
| | Policies | |
| All Bandit Policies | Epsilon-Greedy Policy | Q-Greedy Policy |

Build Instructions

Dependencies

To build the library you need:

In addition, full C++17 support is now required (this means at least g++-7)

Building

Once you have all required dependencies, you can simply execute the following commands from the project's main folder:

    mkdir build
    cd build/
    cmake ..
    make

`cmake` can be called with a series of flags to customize the output, if building everything is not desirable. The following flags are available:

    CMAKE_BUILD_TYPE # Defines the build type
    MAKE_ALL         # Builds all there is to build in the project
    MAKE_LIB         # Builds the whole core C++ libraries (MDP, POMDP, etc..)
    MAKE_MDP         # Builds only the core C++ MDP library
    MAKE_FMDP        # Builds only the core C++ Factored/Multi-Agent and MDP libraries
    MAKE_POMDP       # Builds only the core C++ POMDP and MDP libraries
    MAKE_TESTS       # Builds the library's tests for the compiled core libraries
    MAKE_EXAMPLES    # Builds the library's examples using the compiled core libraries
    MAKE_PYTHON      # Builds Python bindings for the compiled core libraries
    PYTHON_VERSION   # Selects the Python version you want (2 or 3). If not
                     # specified, we try to guess based on your default interpreter.

These flags can be combined as needed. For example:

    # Will build MDP and MDP Python 3 bindings
    cmake -DCMAKE_BUILD_TYPE=Debug -DMAKE_MDP=1 -DMAKE_PYTHON=1 -DPYTHON_VERSION=3 ..

The default flags when nothing is specified are `MAKE_ALL` and `CMAKE_BUILD_TYPE=Release`.

The static library files will be available directly in the build directory. Three separate libraries are built: `AIToolboxMDP`, `AIToolboxPOMDP` and `AIToolboxFMDP`. If you want to link against either the POMDP library or the Factored MDP library, you will also need to link against the MDP one, since both of them use MDP functionality.

A number of small tests are included, which you can find in the `test/` folder. You can execute them after building the project by running the following command from the `build` directory, just after `make` finishes:

    ctest

The tests also offer a brief introduction to the framework until a more complete write-up is available. Only the tests for the parts of the library that you compiled are built.

To compile the library's documentation you need Doxygen. Simply execute the following command from the project's root folder:

    doxygen

After that, the documentation will be generated into an `html` folder in the main directory.

Compiling a Program

To compile a program that uses this library, simply link it against the compiled libraries you need, and possibly against the `lp_solve` libraries (if using POMDP or FMDP).

Please note that since both the POMDP and FMDP libraries rely on the MDP code, you MUST specify those libraries before the MDP library when linking; otherwise you may get `undefined reference` errors. The POMDP and Factored MDP libraries do not currently depend on each other, so their relative order does not matter.

For Python, you just need to import the `AIToolbox.so` module, and you'll be able to use the classes as exported to Python. All classes are documented, and you can run the following in the Python CLI

    help(AIToolbox.MDP)
    help(AIToolbox.POMDP)

to see the documentation for each specific class.
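
If Python cannot find the module, a common cause is that the directory holding `AIToolbox.so` is not on the import path. The following is a minimal sketch, assuming the module sits in a `build` directory relative to where the script is run. Both that path and the helper name `load_aitoolbox` are assumptions, so adjust them to your setup.

```python
import importlib
import sys
from pathlib import Path

def load_aitoolbox(build_dir="build"):
    """Try to import AIToolbox, also searching `build_dir` (an assumed location).

    Returns the module, or None if it cannot be imported.
    """
    path = str(Path(build_dir).resolve())
    if path not in sys.path:
        sys.path.insert(0, path)
    try:
        return importlib.import_module("AIToolbox")
    except ImportError:
        return None

AIToolbox = load_aitoolbox()
if AIToolbox is None:
    print("AIToolbox not found; check that the bindings were built with MAKE_PYTHON")
else:
    help(AIToolbox.MDP)
```

Setting the `PYTHONPATH` environment variable to the same directory achieves the same effect without touching `sys.path` in code.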
