A correct C89/C90/C99/C11/C18 parser written using Menhir and OCaml
The operation and design of this parser are described in detail in the following journal paper (currently under review for publication):
A simple, possibly correct LR parser for C11
Jacques-Henri Jourdan and François Pottier
In order to build it, you can just type
The resulting executable reads a preprocessed C file from its standard input and raises an exception if a parse error occurs.
The following command-line options are available: -
Sets which grammar to use.
c90 tells the parser to use the old grammar, where declarations were not required to have a type specifier, in which case "int" was used (it still recognizes C99, C11 and C18 constructs).
c18 uses the new, simpler grammar: declarations are required to have a type specifier, and the scoping rules are different.
Use the C99/C11/C18 scoping rules even though the old C89/C90 grammar is used. This is always set when using the new grammar.
The C18 standard forbids the use of an opening parenthesis immediately following an atomic type qualifier. This is intended to avoid a possible ambiguity with _Atomic used in a type specifier. This parser disambiguates this apparent conflict so that this restriction can be lifted safely.
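The difference between the C89/C90 and the C99/C11/C18 scoping rules mentioned in the options above matters to a parser because it changes whether an identifier names a type at a given point. The following C fragment is a minimal sketch of this and is not taken from the test suite; the names T and scoped are hypothetical. Under the C99 and later rules, an if statement is a block of its own, so an enumeration constant declared in its condition goes out of scope when the statement ends. Under the C89/C90 rules it would stay visible until the end of the enclosing block.

```c
typedef int T;

int scoped(void)
{
    /* Declaring the enumeration constant T here hides the typedef name T. */
    if (sizeof(enum { T }) == 0)
        return -1; /* never taken: sizeof never yields 0 */

    /* C99/C11/C18 scoping: the constant's scope ended with the `if`
       statement, so T is a typedef name again and this is a declaration.
       C89/C90 scoping: T would still be the constant here, and this line
       would be a syntax error. */
    T x = 42;
    return x;
}
```

A parser that classifies identifiers as type names or ordinary names therefore has to know which set of scoping rules is in force before it can decide how to read the last declaration.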
If you want to use this parser in a C front-end, you should fill in the semantic actions of the .mly files with your own code for building your AST:
- The file parser.mly contains a C99/C11/C18 compliant parser. It mostly follows the grammar of the C18 standard.
- The file parser_ansi_compatible.mly is compliant with C89, C99, C11 and C18 (depending on the options given in options.ml). It is significantly more complex than
We provide, in the
tests/ directory, a series of tests that are particularly difficult to handle in a correct C parser. They are all valid C18 fragments, except for: - The files whose names end with
atomic_parenthesis.c, which represents an unnecessary restriction in the syntax presented in the C18 standard.
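The restriction concerns the two roles that _Atomic can play in a declaration. The fragment below is a hedged illustration: the identifiers are hypothetical, and the commented-out line is an assumed example of the restricted form, not a quotation of atomic_parenthesis.c.

```c
/* _Atomic followed by a parenthesized type name: a type specifier. */
_Atomic(int) counter = 1;

/* _Atomic not followed by a parenthesis: a type qualifier. */
int _Atomic flag = 2;

/* _Atomic used as a qualifier and immediately followed by '(':
   the C18 standard rules this out, because it always reads "_Atomic ("
   as the start of a type specifier. A parser that lifts the restriction
   can read the line below as declaring an atomic int named x.

   int _Atomic (x);
*/

int atomic_sum(void)
{
    /* Both objects are atomic ints; each is read atomically. */
    return counter + flag;
}
```

The commented-out declaration is kept out of the compiled code because a standard-conforming compiler is expected to reject it.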
To run the test suite, you need the cram tool, available on most major Linux distributions. Then, simply type