Awesome machine learning for compilers and program optimisation


A curated list of awesome research papers, datasets, and tools for applying machine learning techniques to compilers and program optimisation.




Iterative Compilation and Compiler Option Tuning

Instruction-level Optimisation

Auto-tuning and Design Space Exploration

Parallelism Mapping and Task Scheduling

Domain-specific Optimisation

Languages and Compilation

Code Size Reduction

Cost and Performance Models

Learning Program Representation

Enabling ML in Compilers

Memory/Cache Modeling/Analysis

  • Learning Memory Access Patterns - Milad Hashemi, Kevin Swersky, Jamie A. Smith, Grant Ayers, Heiner Litz, Jichuan Chang, Christos Kozyrakis, Parthasarathy Ranganathan. ICML 2018.
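
The paper above frames prefetching as predicting the *deltas* (differences) between consecutive memory addresses rather than the raw addresses themselves. The toy sketch below illustrates that framing with a simple first-order frequency table; the paper uses LSTM models, so this is a conceptual stand-in, not their implementation, and the trace is synthetic.

```python
from collections import Counter, defaultdict

def delta_stream(addresses):
    """Convert a raw address trace into a stream of deltas
    (differences between consecutive accesses)."""
    return [b - a for a, b in zip(addresses, addresses[1:])]

class MarkovDeltaPredictor:
    """Toy first-order model: for each observed delta, remember which
    delta most often followed it. A stand-in for the learned (LSTM)
    models in the paper, kept simple to show the delta formulation."""
    def __init__(self):
        self.table = defaultdict(Counter)

    def train(self, deltas):
        for prev, nxt in zip(deltas, deltas[1:]):
            self.table[prev][nxt] += 1

    def predict(self, last_delta):
        followers = self.table.get(last_delta)
        if not followers:
            return None
        return followers.most_common(1)[0][0]

# Synthetic trace: a 64-byte strided walk with a 4 KiB jump every
# tenth access -- the kind of regular pattern a learned prefetcher
# is expected to pick up.
trace = []
addr = 0x1000
for i in range(100):
    trace.append(addr)
    addr += 64 if i % 10 != 9 else 4096

model = MarkovDeltaPredictor()
model.train(delta_stream(trace))
print(model.predict(64))
```

On this trace the model learns that a 64-byte stride is usually followed by another 64-byte stride, and that the 4 KiB jump is always followed by a return to the stride.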


Talks and Tutorials

Tools

  • CompilerGym - reinforcement learning environments for compiler optimizations (paper).
  • CodeBERT - pre-trained DNN models for programming languages (paper).
  • ProGraML - LLVM and XLA IR program representations for machine learning (paper).
  • NeuroVectorizer - Using deep reinforcement learning (RL) to predict optimal vectorization compiler pragmas (paper).
  • TVM - Open deep learning compiler stack for CPUs, GPUs, and specialized accelerators (paper; slides).
  • clgen - Benchmark generator using LSTMs (paper; slides).
  • COBAYN - Compiler Autotuning using BNs (paper).
  • OpenTuner - Framework for building domain-specific multi-objective program autotuners (paper; slides).
  • ONNX-MLIR - Representation and Reference Lowering of ONNX Models in MLIR Compiler Infrastructure (paper).
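
Several of the tools above (OpenTuner, COBAYN) and the iterative-compilation literature share one core loop: propose a configuration of compiler options, measure its cost, keep the best. The sketch below shows that loop as plain random search. The flag names and the cost function are invented for illustration; in a real autotuner the cost would come from compiling and timing the program.

```python
import random

# Hypothetical tuning space: each "flag" has a small set of settings.
SPACE = {
    "opt_level": [0, 1, 2, 3],
    "unroll": [1, 2, 4, 8],
    "vector_width": [1, 2, 4],
}

def cost(config):
    """Synthetic stand-in for 'compile with these flags and time it'.
    Lower is better; the optimum here is opt_level=3, unroll=4,
    vector_width=4. Real autotuners measure this empirically."""
    return (
        (3 - config["opt_level"]) * 10
        + abs(config["unroll"] - 4)
        + (4 - config["vector_width"]) * 2
    )

def random_config(rng):
    return {k: rng.choice(v) for k, v in SPACE.items()}

def random_search(budget=200, seed=0):
    """The simplest design-space exploration loop: sample, measure,
    keep the best seen so far."""
    rng = random.Random(seed)
    best = random_config(rng)
    best_cost = cost(best)
    for _ in range(budget):
        cand = random_config(rng)
        c = cost(cand)
        if c < best_cost:
            best, best_cost = cand, c
    return best, best_cost

best, best_cost = random_search()
print(best, best_cost)
```

Frameworks like OpenTuner replace the random sampler with smarter search techniques (and ensembles of them), but the measure-and-keep-best structure is the same.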

Benchmarks and Datasets

  • The Alberta Workloads for the SPEC CPU® 2017 Benchmark Suite - Additional workloads for the SPEC CPU2017 Benchmark Suite.
  • Project CodeNet - Code samples written in 50+ programming languages, annotated with metadata such as code size, memory footprint, CPU run time, and status (acceptance/error types).
  • CodeXGLUE - A machine learning benchmark dataset for code understanding and generation (paper).
  • AnghaBench - A suite of one million compilable C benchmarks (paper).
  • BHive - A Benchmark Suite and Measurement Framework for Validating x86-64 Basic Block Performance Models (paper).
  • cBench - 32 C benchmarks with datasets and driver scripts.
  • PolyBench - Dataset - Multiple datasets for PolyBench (paper).
  • PolyBench - 31 Stencil and Linear-algebra benchmarks with datasets and driver scripts.
  • PolyBench - Original - 30 Stencil and Linear-algebra benchmarks with datasets and driver scripts.
  • DeepDataFlow - 469k LLVM-IR files and 8.6B data-flow analysis labels for classification (paper).
  • devmap - 650 OpenCL benchmark features and CPU/GPU classification labels (paper; slides).



How to Contribute

See Contribution Guidelines. TL;DR: send one of the maintainers a pull request.
