Running the code you want to write as fast as the code you have to write
A prototype compiler for running the code you want to write as fast as the code you have to write (as long as it's numerical).
By which I mean that DVL is a language and compiler for writing the numerical kernel of your system in, instead of Fortran or C. The DVL input language is much nicer and higher level, a dialect of Lisp; but more important, high level DVL constructs compile to efficient code.
Why yet another compiler? Isn't it already possible to get within 3-5x of Fortran with a little annotation of Common Lisp or Haskell? Yes, but only by writing Fortran in parentheses (or in monads). I know of no compiler where modularity is costless (except research systems along the lines of DVL).
It's not costless in DVL either, but in DVL you pay for it with resource-intensive compilations rather than slow runtimes. In the numeric context, all the modularity constructs you know and love: data structures, closures, higher-order functions, abstraction boundaries of every stripe -- even automatic differentiation -- are just scaffolding for carrying numbers hither and yon to be added and multiplied. DVL solves the scaffolding at compile time and emits efficient code to do the arithmetic at runtime.
Disclaimer: DVL is a prototype. The language is minimal (but you don't need much to predict the positions of the planets); the error messages are inscrutable; and the commandline interface is inflexible. DVL is not curretnly polished enough to be really used by someone who does not have time to understand its internals. You have been warned.
DVL is written primarily in MIT Scheme and distributed in source form, so you will need to install that first. If you want to use any other backend besides compiling through MIT Scheme, you will need to install that system. In particular, the Haskell backend, in addition to generating Haskell, is written in Haskell, so you will need GHC twice for that.
When ready, just
git clonethis repository and then
git submodule initand
git submodule updateto pull in code dependencies. Then
and you're off to the races. Or fire up your MIT Scheme and
to use DVL as a library.
DVL is a prototype. The purpose of making it public is experimentation, collaboration, and elucidation. You are not expected to be able to use it without considerable source diving -- but the source is intended to be legible.
That said, read
dvl/READMEto get started, or the
dvlshell script to see the entry points.
The code tree is organized as follows:
fol/Implementation of FOL (intermediate target for VL and DVL)
vl/Implementation of VL (uses fol/)
dvl/Implementation of DVL (some refers to vl/; uses fol/)
support/pattern-case/Schemely pattern-matching macro (submodule)
support/rules/Rule-based term-rewriting system (submodule)
testing/Unit test framework (submodule)
*/test/Unit tests for *
*/load.scmScheme entry point for *
*/READMEEntry point for * documentation
fol/folCommandline entry point for FOL
vl/vlCommandline entry point for VL
dvl/dvlCommandline entry point for DVL
DVL consists of three main pieces:
In reverse order,
3) FOL is the intermediate language for VL and DVL. Instead of being entirely internal, FOL is a standalone compiler for a statically typed first-order language with Scheme syntax (which makes it easy to generate programmatically). The FOL compiler is dominated by a FOL to FOL optimizer, which performs inlining, scalar replacement of aggregates, dead code elimination, common subexpression elimination, and some algebraic simplification. Optimized FOL can then be compiled to linkable libraries or executables via any of several backends.
2) VL is a compiler implementing union-free polyvariant flow analysis on a (almost) functional source language. It generates FOL, and therefore does not itself contain any post-analysis optimization.
1) DVL is a variation on VL extended to be able to reason about gensyms (unique values); DVL the language adds the primitives GENSYM and GENSYM= to VL. Gensyms are sufficient to implement forward mode AD correctly by operator overloading; the flow analysis suffices to migrate all the boxing, tagging, dispatching, and unboxing that this normally entails to compile time. DVL also adds a rudimentary foreign interface.
DVL is based on Stalingrad, which was developed by Professors Jeffrey Mark Siskind at Purdue University and Barak A. Pearlmutter at the Hamilton Institute, with work beginning at least as early as 2004.
Stalingrad is an explicitly automatic-differentiation-aware compiler for scientific computing, building on Professor Siskind's earlier work on the Stalin compiler for Scheme. The source language Stalingrad compiles is called VLAD, for Vunctional Language with Automatic Differentiation -- a language that, true to its name, is purely functional (except for a minimal i/o facility), and provides built-in operators for invoking AD.
I had the privilege of having a post-doctoral appointment with Professor Pearlmutter between the fall of 2010 and the summer of 2012, working on compilation for numerical computing and automatic differentiation. During that time, I rewrote the non-AD portion of VLAD in a system called, appropriately enough, VL (included in the
vl/directory) as a more expository implementation of the compilation technique at the heart of Stalingrad.
After VL, I found myself taking this same compilation method in a slightly different direction than VLAD and Stalingrad. I noticed that it essentially still worked even in the presence of the slight language impurity of being able to generate unique objects -- which is sufficient for an implementation of AD in user space, without having to increase the (already considerable) complexity of the compiler with it. Thus was born the DysVunctional Language.
--Alexey Radul, Boston, 2013